SOME FACTORS FOR PUPIL CONTROL MEASURED
AND RELATED 


HERBERT SORENSON 
University of Minnesota 


Pupil control or pupil guidance depends to a large degree on 
measurements of traits and capacities which have been evaluated in 
terms of each other and against their predictive value in various 
situations. Because individuals find themselves in multiple situations, 
requiring different modes of behavior, measurements for guidance must 
be devised in terms of several abilities. Sampling all traits is impos- 
sible. The most profitable procedure is to measure those factors which 
have proved themselves significant, and some which are thought to be 
major in importance for pupil control. 

In this study tests and measurements are analyzed and related 
with each other in terms of school situations where they are most 
significant. The tests and measurements used were diversified in their
nature, ranging from those testing special abilities to those testing varied 
traits and characteristics expressed in a behavior score. There are six, 
as follows: (1) Intelligence; (2) Behavior; (3) Mechanical interest; (4) 
Mechanical ability; (5) Industrial grades and (6) Academic grades. 
These measures were obtained from over six hundred pupils of the 
7B, 8B, and 9B grades. 

Intelligence was measured by the Terman Group Test of Mental 
Ability. Measurements of behavior were secured by means of a 
rating scale of behavior tendencies, known as Schedule B, which was 
used by Olson.1 Mechanical ability and interest were measured by a
mechanical interest analysis blank and paper form board.2 Behavior


1 Olson, W. C.: “The Incidence of Undesirable Behavior Tendencies in Problem
Children.” Ph. D. Thesis, University of Minnesota.

2 These two exercises were used in the Mechanical Abilities Research carried
out under the direction of Prof. Elliot and Paterson of the University of Minnesota
Psychology Department.
was determined by a graphic rating scale of thirty-five items for rating
traits symptomatic of behavior difficulties. The mechanical ability 
test is made up of geometrical forms which have to be dissected accord- 
ing to suggested patterns. The interest analysis blank uses the Freyd 
technique which calls for an indication of likes and dislikes to numerous 
items. 

To ascertain the student’s average in academic subjects, two 
semesters’ grades for four subjects were averaged. The four subjects 
included in the academic averages were chosen from the following: 
English, language or grammar, mathematics, history, geography, and 
civics. 

Because of the prevalent uniformity of students’ programs, nearly 
every student carried, each semester, the equivalent of one shop or 
industrial credit. The average industrial grade of each pupil was 
determined by averaging all the industrial grades reported for him. 
Reports were made every six weeks. This average was made up from 
the equivalent of two semester hours of work. By comparing the 
methods used, it will be seen that academic averages were found by 
averaging the semester grades of four subjects for two semesters, while 
the industrial grade average was obtained by an average equivalent to 
one industrial subject for two semesters. 

Correlational data were confined principally to the 8B grade. 
The 8B grade is the median grade of the Junior High School, and it is 
very probable that the calculation of correlations for the other two 
grades would have produced results incommensurate with the labor 
involved in such computations. Table I shows the zero-order correla- 
tions of measurements made of two hundred three 8B pupils. 

It should be explained that the intelligence of a pupil within his 
grade was determined by finding his percentile rank on a combined 
ranking within his grade of both intelligence quotient and mental 
age. This was done with the intent of obtaining a pupil’s mental 
status within a grade without overemphasizing either brightness or 
maturity. This plan is consistent with the method of ability grouping 
practiced in the Minneapolis Junior High Schools. 

Table I contains zero-order correlation for the six measurements. 
The highest relationships are found between industrial grades and 
behavior score, academic grades and behavior score, academic grades 
and industrial grades, intelligence and academic grades, and between 
paper form board and industrial grades. The best single measure for 
predicting academic success is the intelligence test, with scores on
Schedule B being second in value. The correlation between intelligence
and academic grades is .618, while that between Schedule B
and academic grades is −.551. Olson found that intelligence and
behavior scores correlated .503 and −.568, respectively, with achievement
on the Stanford Achievement Test.1 Correlations
involving Schedule B are negative because the scoring arrangement 
scores undesirable behavior tendencies high, while desirable behavior 
is given a low numeral score. Consequently, the reader should 
interpret a high behavior score as indicative of excessive undesirable 
behavior tendencies, while a lower score indicates fewer undesirable 
behavior tendencies. 


TABLE I.—CORRELATION OF SIX MEASUREMENTS

                               1       2       3       4       5       6
1. Academic grades .......   ....    .487    .618    .303   −.551   −.07
2. Industrial grades .....   .487    ....    .264    .447   −.479    .045
3. Intelligence ..........   .618    .264    ....    .329   −.383    .03
4. Paper form board ......   .303    .447    .329    ....   −.170    .207
5. Behavior scores .......  −.551   −.479   −.383   −.170    ....    .124

Behavior ratings were obtained on the 8B pupils on two separate
occasions three semesters apart. By correlating the behavior ratings 
with the academic grades made during the year of the rating, or with 
those given the following or preceding year, one can obtain the effect 
of time contiguity on these correlations. The behavior ratings will 
be known as the first and second ratings and will be limited to Schedule 
B. The academic averages are designated as first and second marks. 
The first ratings are those which were made in the middle of the school 
year, and the first marks are those earned during the same school 
year. The second ratings were made at the end of that year. In 
Table II are found the r’s between these measurements. 

The correlation of the first ratings with the first marks is −.55
± .03. For the second ratings and marks, the correlation is also −.55
± .03. When the first behavior ratings are correlated with the grades
earned the succeeding year an r of −.51 ± .04 is obtained. The r





1 Olson, W. C.: Op. cit., 
between the second ratings and the marks earned the previous year
is −.50 ± .04. The correlation between the grades given during
one year and those given the following year is .76 ± .02. This correlation
may not be interpreted directly as a reliability coefficient,
although it does point to a consistency in grading. Between the
first and second behavior ratings a correlation of .51 ± .04 is obtained.
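
The "±" figures attached to these coefficients are probable errors. As a minimal sketch of how such a figure arises, the classical formula P.E. = .6745(1 − r²)/√N can be evaluated directly. The function name and the value N = 203 are assumptions made for illustration; the text gives 203 as the size of the 8B group in Table I but does not state the N behind each coefficient in Table II.

```python
import math


def probable_error_of_r(r: float, n: int) -> float:
    """Classical probable error of a Pearson r: 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Assumption: N = 203, the size of the 8B group reported for Table I.
N_ASSUMED = 203

for r in (-0.55, -0.51, -0.50, 0.76, 0.51):
    pe = probable_error_of_r(r, N_ASSUMED)
    print(f"r = {r:+.2f}   probable error = {pe:.2f}")
```

Under that assumed N, the rounded results agree with the probable errors printed beside the coefficients.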


TABLE II.—CORRELATIONS OF BEHAVIOR RATINGS AND ACADEMIC MARKS
DETERMINED FOR TWO PERIODS THREE SEMESTERS APART

                                          Behavior rating
Academic marks                  First period    Three semesters later
First period ................   −.55 ± .03      −.50 ± .04
Three semesters later .......   −.51 ± .04      −.55 ± .03








The correlations between behavior ratings and school grades of 
the same year are higher than the correlations of those separated by a 
period of more than a year. This difference may be due to changes of 
scholarship and behavior which were measured, and secondly, to the 
greater number of teachers who both marked and rated the same pupils. 
The differences in the correlations are not large enough to be significant, 
and consequently, are only suggestive of the above explanations which 
must be qualified as tentative. 

The almost equally high correlation between grades and behavior 
score, separated by over a year, as between these scores when not 
separated by a period, points to their value for prediction. If a time 
factor nullified all relationship, an instrument’s value for prediction 
would be destroyed. 

Three measures correlate about equally high with industrial grades. 
They are academic grades, Schedule B scores, and paper form board 
scores, with correlations of .487, −.479, and .447, which are distinctly
higher than the correlation between industrial grades and intelligence, 
which is .264. 

The high correlation between academic grades and industrial 
grades offers, in view of the low correlation between industrial grades 
and intelligence, but with a high correlation between academic grades 
and intelligence, the interpretation that personality factors are reflected 
in the school grades. 
The correlations between mechanical interest scores and the other
measurements are low. The highest correlation, .207, is between
interest scores and paper form board scores. Correlations obtained by
Hubbard, in junior high school groups, between the same two variables
ranged between −.07 and +.22.1 All of her correlations of mechanical
interests with other measurements were low to a degree corresponding
to those obtained in this study.

To what extent one variable may be a factor influencing high relationships
between two other factors is not to be detected from these
correlations. It is quite possible that a third variable produces a
spuriously high correlation between any two variables. To remove
the effect of such a third variable on the relationship between
two others, partial correlation was employed.

The writer, in presenting data involving partial correlation, is 
not unaware of the discussion concerning the significance and actual 
meaning of partial correlation and the limitations of its use. How- 
ever, the evaluations of the partial correlation technique warn chiefly 
against its use for attributing cause and effect to factors, and against 
its use in regression coefficients for determining a dependent variable 
from proportional parts of independent variables. Burks and Kelley 
do not oppose the statistical use of partial correlation to examine 
comparative effects of various factors, but limit their objection pri- 
marily to causal evaluations. It is suggested by them that partial 
correlation does not remove a factor which is entirely discrete to the 
other two factors whose true relation is looked for, but that these 
factors may have a causal relation to the factor partialled out, or to 
unmeasured, remote factors which may affect either or all factors.
In view of the uncertainty which obtains in the use and interpretation 
of the partial correlation technique, its use was limited to relative 
comparisons of first and second order coefficients. 

1 Hubbard, Ruth M.: “Quantitative Studies of Mechanical Abilities.” Ph.D.
Thesis, University of Minnesota (Unpublished), p. 59.

2 Burks, Barbara S. and Truman L. Kelley: Statistical Hazards in Nature-Nurture
Investigations. Twenty-seventh Year Book of the National Society for the
Study of Education, Part I, 1928, pp. 11-16.

For comparative purposes the partial correlations involving behavior
score, academic grades and other factors are presented. When
behavior score is partialled out, the correlation coefficient between
intelligence and academic grades drops from .618 to .529, and may be
compared with .264 and .100 for the same relationships involving industrial
grades. The influence of behavior score on the two is apparently
greater in the case of industrial grades, but in predictive terms the greater
reduction of the lower coefficients is not larger than the comparatively
smaller reduction of higher coefficients. The relation between academic
grades and behavior score when intelligence remains constant
falls from −.551 to −.433. For similar relationships involving industrial
grades the zero and first order coefficients were −.479 and −.425.
Apparently, intelligence is a more influential factor in the relationship
of behavior score and academic grades than in that between behavior
score and industrial grades, a condition which is to be expected in
view of the high correlation between intelligence and academic grades.
When intelligence was partialled out, the correlation between behavior
score and industrial grades dropped only to −.425. Intelligence correlates .264
with industrial grades. This coefficient drops to .100 by a first order
partial which holds behavior score constant. A reduction from .447
to .423 in the relationship of paper form board and industrial grades
when behavior is held constant shows that behavior is not a major
factor in the correlation of paper form board and industrial grades.
It appears in general from these correlations that behavior tendencies
as measured by Schedule B do not affect materially the relationships
between these several factors and industrial grades.
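
The first order partials quoted in this paragraph can be checked against the zero order coefficients of Table I. The following is a minimal sketch, assuming the standard first order formula r(xy.z) = (r(xy) − r(xz) r(yz)) / √((1 − r(xz)²)(1 − r(yz)²)); the function and constant names are illustrative only and do not come from the study.

```python
import math


def partial_r(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Zero order coefficients as quoted in the text and Table I.
R_INT_ACAD = 0.618    # intelligence with academic grades
R_INT_IND = 0.264     # intelligence with industrial grades
R_BEH_ACAD = -0.551   # behavior score with academic grades
R_BEH_IND = -0.479    # behavior score with industrial grades
R_BEH_INT = -0.383    # behavior score with intelligence

# Intelligence and each kind of grade, behavior score held constant.
int_acad_given_beh = partial_r(R_INT_ACAD, R_BEH_INT, R_BEH_ACAD)
int_ind_given_beh = partial_r(R_INT_IND, R_BEH_INT, R_BEH_IND)

# Behavior score and each kind of grade, intelligence held constant.
beh_acad_given_int = partial_r(R_BEH_ACAD, R_BEH_INT, R_INT_ACAD)
beh_ind_given_int = partial_r(R_BEH_IND, R_BEH_INT, R_INT_IND)

for name, value in (
    ("intelligence-academic | behavior", int_acad_given_beh),
    ("intelligence-industrial | behavior", int_ind_given_beh),
    ("behavior-academic | intelligence", beh_acad_given_int),
    ("behavior-industrial | intelligence", beh_ind_given_int),
):
    print(f"{name:38s} {value:+.3f}")
```

The values so obtained agree with the printed .529, .100, −.433, and −.425 to within about .002, a difference that is presumably due to rounding of the zero order coefficients.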


TABLE III.—PARTIAL CORRELATION OF VARIOUS MEASURES WITH BEHAVIOR
SCORE HELD CONSTANT
(In Parentheses Are Placed the Zero Order Correlations of These Measurements)

                        Intelligence    Paper form board    Industrial grades
Academic grades ....    .529 (.618)     .255 (.303)         .304 (.487)
Industrial grades ..    .100 (.264)     .423 (.447)         ....
Paper form board ...    .290 (.329)     ....                ....





The extent to which school marks measure the adequacy of a 
pupil’s school adjustment and thus to a greater or lesser extent his general
adjustment is not known. In a general way we judge a pupil’s success
or failure in school by his marks. We further know that there are 
quite definite relationships between school marks and elimination from 
school and that school marks of one grade level offer one of the best 
criteria for predicting school success in subsequent grade levels. How- 
ever, no one would maintain by the evidence in present literature that 
school marks are the best index to personality traits and difficulties
which tend toward abnormal and unsocial behavior. But on the 
other hand most case studies of delinquent children mention poor 
scholarship as one of the factors of mal-adjustment. Its relationship 
to the delinquency is probably symptomatic rather than causal. If 
that is true, poor scholarship is a result of mal-adjustment rather 
than a causal contributor to it. 

In order to bring out the relation of behavior score to scholarship 
(but with no attempt to ascribe causes or effects), scores on Schedule 
B for the 8B grades were arranged according to academic grades; 
and similarly, average grades according to behavior score in Table IV. 
By considering the range for behavior score medians from 47.5 to 
88.3, one has a range that is 2.47 times the probable error of the dis- 
tribution. The range of academic grade medians ranging from 71 
to 89.3 and corresponding with a range of 40 to 115 in behavior score 
is 2.54 times the probable error of the grade distribution. Of the 
ninety-six pupils who have grades above the average only seventeen 
are definitely above the average behavior score and of the eighty- 
three pupils whose behavior scores are 70 or above (67.3 is the average) 
the same seventeen are the only ones above the average in scholarship. 
Thus only 17.7 per cent of the pupils in the upper half in scholarship 
are in the upper half in behavior, and of the eighty-three pupils who 
are above the 62.1 percentile in behavior only 20.5 per cent are above
the average in achievement. The Schedule B score is thus of
considerable value in predicting school success as indicated by school 
marks. This relation is more remarkable in absence of another very 
significant predictive factor, the intelligence test. 

To ascertain a combination of factors which would give the best 
method of predicting school success, multiple correlations were ob- 
tained. The largest coefficient of correlation by zero order which 
included industrial grades, was —.479 between industrial grades and 
behavior scores. By multiple correlation a coefficient of .735 was 
obtained by correlating industrial grades with behavior scores, paper
form board, and intelligence. This coefficient is distinctly larger
than .479 and if errors in all the measurements could be reduced the
multiple correlation coefficient probably could be increased to a point
where its value for prediction purposes would be enhanced considerably. 

A multiple coefficient of .712 is obtained between academic grades 
and behavior score, paper form board, and intelligence. The multiple 
coefficient of correlation between academic grades and behavior score 
and intelligence is .705 which, because of its nearness to .712 and its
containing one less variable, is probably the more usable relationship 
for purposes of predicting academic scholarship. By correlating 
academic grades with behavior score, intelligence, and industrial 
grades, a correlation coefficient of .766 is obtained. 
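
For the two-predictor case, the multiple coefficient can be obtained from three zero order correlations alone. The sketch below assumes the usual formula R² = (r₁² + r₂² − 2 r₁ r₂ r₁₂) / (1 − r₁₂²) and applies it to academic grades predicted from behavior score and intelligence, using the Table I values. The function name is hypothetical, and nothing here is offered as the computing routine actually followed in the study.

```python
import math


def multiple_r_two_predictors(r_y1: float, r_y2: float, r_12: float) -> float:
    """Multiple correlation of a criterion y with two predictors.

    r_y1, r_y2: zero order correlations of y with predictors 1 and 2.
    r_12: zero order correlation between the two predictors.
    """
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)


# Academic grades with behavior score (-.551) and intelligence (.618);
# behavior score and intelligence correlate -.383 with each other.
R_ACADEMIC = multiple_r_two_predictors(r_y1=-0.551, r_y2=0.618, r_12=-0.383)
print(f"R = {R_ACADEMIC:.3f}")
```

The result agrees with the coefficient of .705 reported for this combination.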

The zero order correlation between intelligence and industrial
grades is .264. The numerical value resulting from applying the formula,
$1 - \sqrt{1 - r^{2}}$, to this amount is .035, which indicates the extent
to which, by the aid of a correlation of .264, one’s individual prediction
is better than chance. The multiple correlation of industrial grades
with intelligence, behavior score, and paper form board is .735. The
inclusion of behavior score and paper form board in the correlation
of intelligence with industrial grades raises the coefficient
of correlation from the zero order value of .264 to .735. When the criterion for individual
prediction is applied to the coefficient .735, the quantity .33 is obtained.
Although its value is not of such a magnitude that it removes one
from pure chance to infallible prediction, it does seem large when
compared with .035.
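
The criterion for individual prediction used here is a one-line computation. As a minimal sketch (the function name is an assumption made for illustration), it may be applied to the several coefficients discussed in this study:

```python
import math


def prediction_gain(r: float) -> float:
    """Proportion by which prediction from r betters chance: 1 - sqrt(1 - r^2)."""
    return 1.0 - math.sqrt(1.0 - r * r)


# Coefficients quoted in the text; the labels are descriptive only.
COEFFICIENTS = (
    ("intelligence with industrial grades", 0.264),
    ("intelligence with academic grades", 0.618),
    ("multiple R, academic grades (two predictors)", 0.705),
    ("multiple R, industrial grades (three predictors)", 0.735),
    ("multiple R, academic grades (three predictors)", 0.766),
)

for label, r in COEFFICIENTS:
    print(f"{label:50s} r = {r:.3f}   gain = {prediction_gain(r):.3f}")
```

The computed quantities agree with those printed in the text to within about .01, and they make plain how slowly the gain rises until r becomes quite large.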


TABLE IV.—RELATIONSHIP OF SCHOLARSHIP TO BEHAVIOR SCORES—8B GRADE

[The cell frequencies of this table are not legible in the source; only the marginal totals and medians are reproduced. Entries shown as .... are illegible.]

Academic grades      Pupils      Median behavior score
93-96                ....        47.5
89-92                18          52.9
85-88                32          56.4
81-84                31          61.9
77-80                54          65.8
73-76                40          72.2
69-72                18          82.0
65-68                ....        88.3
61-64                ....        112.5
Total                218

Behavior score, Schedule B      Pupils      Median academic grade
120-129                         1           71
110-119                         4           71
100-109                         4           77
90-99                           12          73.8
80-89                           26          76.3
70-79                           36          78.4
60-69                           43          80.2
50-59                           59          83.2
40-49                           33          89.3
Total                           218         80

SD of marks is 7.2. Average of marks is 81.1. 
SD of behavior scores is 16.55. Average of behavior score is 67.3. 


Academic grades correlate .618 with intelligence. The relationship 
of those two factors is closer than that of any other two which are 
included in this study. When .618 is substituted for r in the equation
$1 - \sqrt{1 - r^{2}}$, .216 is obtained. The multiple correlation of academic
grades with behavior score and intelligence is .705 and when substituted
as .618 was substituted above, the result is .29. When
behavior score is included with intelligence in correlation with academic 
grades, individual prediction is raised from .216 to .290 which, in 
terms of gain, represents an increase of 34 per cent. An increase of 
34 per cent seems large but in view of the rather small quantity, .216, 
on which the increase was calculated, the gain does not approach 
practical significance. For purposes of individual prediction, a 
correlation coefficient of .87 or more is necessary to obtain a basis for 
prediction which is 50 per cent better than chance. A coefficient 
that offers the nearest approach to that is a second order multiple 
correlation, .766, obtained by correlating academic grades with 
behavior score, intelligence, and industrial grades. When .766 is 
substituted in the formula for the root mean square correctness of 
estimation, .36 results. 
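
The figures in this paragraph follow from the same expression. As a brief worked derivation (the symbol $E$ for the proportion by which prediction betters chance is introduced here only for convenience and does not appear in the original), one may solve for the coefficient required to reach a given $E$ and verify the stated gain:

```latex
\begin{align*}
E &= 1 - \sqrt{1 - r^{2}} \\
r &= \sqrt{1 - (1 - E)^{2}} \\
E = .50 \;&\Rightarrow\; r = \sqrt{1 - (.50)^{2}} = \sqrt{.75} \approx .87 \\
\text{relative gain} &= \frac{.290 - .216}{.216} \approx .34
\end{align*}
```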


TABLE V.—MULTIPLE CORRELATION BETWEEN INDUSTRIAL GRADES, ACADEMIC
GRADES, AND OTHER MEASUREMENTS
(In Parentheses Are Placed the Zero Order Correlations According to the Sequences
in Which the “Other Measurements” Are Mentioned)

Criterion            Other measurements                                      Multiple R
Industrial grades    Behavior score (−.479), paper form board (.447)           .719
Industrial grades    Behavior score (−.479), paper form board (.447),
                     academic grades (.487)                                    .744
Industrial grades    Behavior score (−.479), paper form board (.447),
                     intelligence (.264)                                       .735
Academic grades      Behavior score (−.551), intelligence (.618)               .705
Academic grades      Behavior score (−.551), intelligence (.618),
                     paper form board (.303)                                   .712
Academic grades      Behavior score (−.551), intelligence (.618),
                     industrial grades (.487)                                  .766





It appears from these data that behavior score is about as signifi- 
cant as any of the other measurements according to the magnitude 
of its effect on their relationships in multiple and partial correlations 
and according also to the extent that it correlates with these measurements.
It correlates less with academic grades than does intelligence,
but its correlation with industrial grades is greater. That relation 
does not lessen the value of the intelligence test but probably points 
to a situation where the traits which reflect themselves in behavior 
scores also reflect themselves in school marks. 
THE UNRELIABILITY OF RELIABILITY
COEFFICIENTS 


EDWARD A. LINCOLN 


Harvard University 


In the whole field of scientific study in psychology and education 
there is no more important problem than the question of the measure- 
ment of the traits and abilities with which these studies deal, because 
effective research and experiment are not possible without means for 
the accurate measurement of the various factors involved. This fact is 
generally accepted, and good research workers take the utmost care in 
the selection of their measuring devices and procedures. One of the 
best known and most widely used methods for testing the accuracy and 
consistency of measurements is the reliability coefficient technique, 
which consists in finding the correlation between two sets of measure- 
ments of the same trait taken on the same group of subjects. It is 
commonly believed that a high reliability coefficient is a guarantee 
of accurate measurement, and that a low reliability coefficient indicates 
a lack of accuracy. The purpose of this paper is to show that this 
notion is, to some extent, at least, a mistaken one. 

The data for this inquiry were obtained in the course of the Harvard 
Growth Study, in which project several thousand public school and 
institutional children are being measured annually over a period of 
twelve years. Early in this project it became apparent that the 
methods used in obtaining the physical measurements were not entirely 
satisfactory, and the writer undertook an investigation of the reliability
of those methods. The results of this investigation led to a revision
of the measuring technique employed and a marked increase in
the reliability of the measurements obtained. The story of this work 
has been told elsewhere.1 The present paper is written for the purpose
of pointing out some rather queer and striking facts concerning the 
reliability coefficients which were found in connection with the previous 
study. 

1 E. A. Lincoln: The Reliability of Anthropometric Measurements. Journal
of Genetic Psychology. Vol. 38, 1930.

Two reliability studies were carried out, one to test the reliability of
the physical measurements obtained under the old technique, and the
other to discover the reliability of measurements obtained with the
revised procedures. In each case the data were obtained by re-measuring
all the children in one or two rooms immediately after the original
series of measurements had been taken. The cards on which the first 
measurements had been set down were collected, and new cards were 
given out for the second series of measurements. Thus none of the 
measurers when he took the second set of measurements had the slightest
notion as to the data already recorded. The first study was carried
out with forty boys and fifty-four girls in a seventh grade, and the 
second was carried out with sixty-seven boys and fifty-two girls in the 
sixth and seventh grades. 


TABLE I.—RESULTS OF THE FIRST RELIABILITY STUDY

                              Boys                              Girls
                      Differences in mm.                Differences in mm.
                      Md      Q1      Q3      r         Md      Q1      Q3      r
Standing height ...   3.1     1.1     4.8    .991       2.8     1.6     4.8    .998
Sternal height ....   6.2     2.5     9.5    .994       5.0     2.9     9.3    .995
Sitting height ....   3.8     1.9     6.5    .983       3.1     1.9     6.6    .989
Head length .......   1.4     0.7     2.3    .914       1.8     1.2     2.8    .786
Head width ........   1.7     0.8     2.8    .591       1.7     1.0     2.6    .694
Chest depth .......   7.0     5.0    14.0    .463      11.5     5.3    17.2    .595
Chest width .......   6.5     4.5     5.5    .830       8.0     3.1    11.8    .792
Leg length ........   4.7     2.4     7.5    .986       4.8     2.3     8.3    .992
Trunk length ......   6.8     3.0    11.4    .940       5.3     2.7     9.5    .942

















TABLE II.—RESULTS OF THE SECOND RELIABILITY STUDY

                              Boys                              Girls
                      Differences in mm.                Differences in mm.
                      Md      Q1      Q3      r         Md      Q1      Q3      r
Standing height ...   3.4     1.5     5.4    .997       3.1     1.5     4.9    .981
Sternal height ....   3.9     2.3     6.7    .992       3.2     1.7     6.0    .993
Sitting height ....   3.9     2.6     5.8    .987       3.5     2.0     5.8    .967
(label illegible) .   3.6     1.9     5.9    .866       4.1     2.4     6.3    .910
Chest width .......   3.9     1.7     6.5    .846       3.8     1.9     5.8    .927
(label illegible) .   3.1     1.5     4.9    .922       2.7     1.4     4.5    .974
(label illegible) .   3.5     1.8     5.1    .999       4.4     2.4     6.7    .993
Trunk length ......   5.4     2.9     9.4    .928       5.5     2.6     9.1    .970

The reliability of the measurements was studied in two ways.
The first was the method of reliability coefficients; that is, Pearson 
product moment correlation coefficients were calculated to show the 
relationship between the first and second series of measurements for 
each trait. The second method was the study of actual differences. 
The difference between each corresponding pair of measurements was 
found, then these differences were distributed, and the median and 
quartile points were computed for each distribution. This procedure 
gave a very clear and unmistakable indication of reliability, for if the 
median difference was a considerable one, it was very certain that the 
measurements were inaccurate. 
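
The two methods may be set side by side in a short computation. The sketch below is illustrative only: the measurements are invented for the purpose and are not data from the Harvard Growth Study, and the function names are assumptions. It computes the Pearson coefficient and the median and quartiles of the absolute differences for a trait that varies widely among children and for one that varies little.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson product moment correlation of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def difference_summary(first, second):
    """Median, lower quartile, and upper quartile of the absolute differences."""
    diffs = [abs(a - b) for a, b in zip(first, second)]
    q1, md, q3 = statistics.quantiles(diffs, n=4)
    return md, q1, q3


# Hypothetical first and second measurements, in mm.
height_1 = [1300, 1350, 1400, 1450, 1500, 1550, 1600, 1650]
height_2 = [1306, 1343, 1408, 1445, 1509, 1544, 1607, 1641]
head_1 = [140, 141, 142, 143, 144, 145, 146, 147]
head_2 = [142, 140, 144, 141, 146, 143, 148, 145]

for name, a, b in (("height", height_1, height_2), ("head width", head_1, head_2)):
    md, q1, q3 = difference_summary(a, b)
    print(f"{name:10s} Md = {md:.1f}  Q1 = {q1:.1f}  Q3 = {q3:.1f}  r = {pearson_r(a, b):.3f}")
```

In these invented figures the trait with the larger errors of measurement yields much the higher coefficient, because the coefficient depends on the spread of the trait among the children as well as on the size of the errors.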

The findings are presented in the accompanying tables. On the 
chance that there would be sex differences, the data for boys and the 
data for girls were treated separately, and so the coefficients and differ- 
ences are reported by sexes, although there are only a few instances 
where the sex differences are probably significant. 

Examination of either table shows that while there is a general 
agreement between the size of the median difference and the reliability 
coefficient there are some very striking exceptions. In Table I, for 
example, the median difference for boys’ sternal height is 6.2 mm., and 
the reliability coefficient is .994, while the median difference for boys’ 
head width is 1.7 mm., with a reliability coefficient of only .591. The 
same inconsistency is found in the results for the girls. Here the 
median difference in sternal height is 5.0 mm., with a reliability of .995, 
while in head width the median difference is 1.7 mm., and the coefficient 
is .694. These variations are all the more noticeable when the upper 
quartiles of the differences are compared. For the sternal height of the 
boys the Q3 is 9.5 mm., while for the head width it is only 2.8 mm.
In the case of the girls the upper quartiles are 9.3 mm. for sternal height 
and 2.6 mm. for head width. Trunk length is another measurement in 
which there is marked disagreement between the reliabilities indicated 
by the different measures. For the boys, there is a median difference 
of 6.8 mm., with an upper quartile of 11.4 mm., which means that in a 
quarter of the cases there was a difference between the first and second 
measurements of more than a whole centimeter, yet the reliability 
coefficient is .940. In the case of the girls, there is a median difference 
of 5.3 mm., the upper quartile is 9.5 mm., and the reliability coefficient 
is .942. 

In the second table the discrepancies are not so large, because the 
reliability of the second method of measuring was much greater than 
the first. There are, however, several indications that the reliability
coefficients are not entirely trustworthy. In the case of sternal height, 
sitting height, and chest width of boys the median difference between 
the two measurements is 3.9 mm., yet the reliability coefficients for 
these measurements are .992, .987, and .846, respectively. Trunk 
length for both boys and girls shows a highly satisfactory reliability 
coefficient in spite of the fact that the median difference in each sex is 
well over five mm., and the upper quartile of the differences is more 
than nine mm. 

It is very clear that if the coefficients alone had been taken as the 
criterion of reliability, the first method of measuring would have been 
accepted as satisfactory for nearly all the traits considered, in spite of 
the fact that about one-half the errors were substantially greater than 
they needed to be, and from ten to twenty-five per cent of the errors 
were of a very serious nature. This state of affairs was easily remedied 
once it was known, but it became known largely through a study of 
actual differences, and not through the conventional method of study- 
ing reliability. 

Although this study was carried out by the use of physical measure- 
ments, it is probable that its most important implications for educators 
and psychologists are in regard to standardized tests. If reliability 
coefficients are not trustworthy when calculated for physical traits 
measured by trained operators using highly accurate instruments, they 
are almost certainly less to be depended upon when the reliability of
tests is in question. There is one very good illustration of this point in 
the manual of a well and favorably known group test. The reliability 
was found to be .90 when the test was given to the same pupils on the 
morning and afternoon of the same day, but the afternoon scores 
increased twelve points, on the average. That is, the pupils were a 
whole year older mentally in the afternoon than they were in the 
morning. This may be reliability in the technical mathematical 
sense, but it is not the sort of reliability which will satisfy the practical 
school man, or which ought to satisfy the careful scientific worker. 
Clearly, then, the meaning of these findings is that any satisfactory 
study of the reliability of a measuring instrument or a measuring 
process must be more comprehensive than the simple consideration of 
reliability coefficients alone. 
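
The group test illustration turns on a property of the correlation coefficient that is easily verified: adding a constant to every score in one series leaves the coefficient unchanged. The following minimal sketch uses invented scores, not those of the test manual referred to, and the names are assumptions made for illustration.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson product moment correlation of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_gain(first, second):
    """Average of the signed differences, second minus first."""
    return statistics.fmean(b - a for a, b in zip(first, second))


# Hypothetical scores for eight pupils.
morning = [88, 95, 102, 110, 117, 125, 131, 140]
retest = [91, 93, 106, 107, 121, 120, 136, 138]

# The same retest with every pupil gaining a further twelve points.
retest_shifted = [score + 12 for score in retest]

r_plain = pearson_r(morning, retest)
r_shifted = pearson_r(morning, retest_shifted)

print(f"r without shift = {r_plain:.3f}, mean gain = {mean_gain(morning, retest):.2f}")
print(f"r with shift    = {r_shifted:.3f}, mean gain = {mean_gain(morning, retest_shifted):.2f}")
```

The coefficient is the same in both cases, while the mean gain differs by the full twelve points; only a study of the actual differences brings the shift to light.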








BASIC CONSIDERATIONS FOR VALID 
INTERPRETATIONS OF EXPERIMENTAL STUDIES 
PERTAINING TO RACIAL DIFFERENCES 


ROBERT P. DANIEL 
Virginia Union University, Richmond, Virginia 


Further development in the comparative psychology of races 
necessitates a clear-cut concept of the cautions involved in racial 
testing. In recognition of this fact, therefore, this study proposes to 
analyze the basic considerations for valid interpretations of experimental 
studies pertaining to racial differences. The plan is to present these 
basic considerations grouped with reference to three “major condi- 
tions” which are significant in the light of experimental evidence; 
and to present a resultant check list in the form of questions which one 
may use as criteria for accepting an experimental study as valid for 
generalizations regarding racial differences. 

In this analysis three limitations should be noted: (1) We are not 
concerned with studies of all phases of racial differences, but with 
studies related to what are called differences in mental ability; (2) we 
are concerned, not with evaluating tests and techniques as such, but 
with the basic conditions of an experimental set-up of an investigation 
leading to generalizations regarding racial comparisons; (3) the various 
aspects of the criteria will be evaluated chiefly with reference to their 
relevancy in the interpretation of the testing of Negroes. 

No factor has injured the mental testing movement more than the 
dogmatic and unscientific generalizations stated as conclusions regarding 
racial psychology. The analyses of the Army Alpha examination, 
which received such widespread dissemination, stimulated much testing 
for racial comparisons. Unfortunately, some of the efforts so stimu- 
lated seemed motivated by the desire to justify race discriminations 
which antedated the testing. Usually little or no attention was given 
to scientific experimental set-up; all differences were called differences 
in innate mental ability. 

Since the chief point reported in the results was the difference in 
the median score intelligence quotient of the groups compared, a 
summary article of the studies made during the period following the war 
may be expected to end thus: “These studies taken all together seem to 
indicate the mental superiority of the white race . . . Altogether it 
may be said that the investigations recognize that these experimental 
results are crude and so they must be taken tentatively.”¹ 

However, questions of random sampling, the validity and reliability 
of the tests and techniques, extent of variability in scores, 
significance of social status, inequalities of educational opportunities as 
they affect scores, and the like operated to challenge the validity of 
many studies for scientific acceptance. 

As a result the problems and methods of investigating comparative 
abilities in races were re-evaluated. Some psychologists adjusted 
their techniques and have been more cautious in their interpretations to 
conform to ever developing scientific criteria. Nevertheless, there 
still appears in the literature from time to time reference to some of 
these early studies as evidence, although their pertinency as judged 
by present criteria is generally disclaimed. As in the board from 
which the nails have been extracted, the holes remain! 

So significant have been the studies made in recent years in cog- 
nizance of the influencing considerations mentioned above that an 
article of summary appearing in 1929, in marked contrast to the sum- 
mary of 1925 previously quoted, concludes: “It may be correctly 
concluded that the consensus of competent scientific thought, 
contemplating the inability of mental testers to define intelligence, the 
inadequacy of all attempts to take such factors as education, social 
status, and language into proper consideration, and the deficiencies of 
testing conditions, finds no proof of racial inferiority or superiority and 
eliminates the usual methods of determining such standing from the 
field of scientific usefulness.”² 

In a synthetic statement, Garth conceived of the problem of race 
psychology as “an effort to (1) measure racial mental character and 
behavior, which are (2) determined through racially inherited nervous 
mechanisms, (3) involving cautiousness in taking into account social 
status, and all facts of nurture and education, until we are (4) able to 
perfect tests that will measure innate abilities in spite of the presence 
of nurture; (5) to take care to secure really random samplings of the 
races studied, (6) to consider that possibly a ‘blanket’ conception of 
racial superiorities is in error, and finally that (7) often the upper limit of 
a group is as important as an average performance.”³ 

1 Garth, T. R.: A Review of Racial Psychology. Psychological Bulletin, Vol. 
XXII, 1925, p. 359. 

2 Yoder, D.: Present Status of the Question of Racial Differences. Journal of 
Educational Psychology, Vol. XIX, 1928, p. 470. 

3 Garth, T. R.: Op. cit., p. 349. 

This statement is highly suggestive of the necessity of a painstaking 
consideration of several elements in the experimental set-up, evidence 
of which should be provided in the report. 

MAJOR CONDITION 1. HAVE THE ENVIRONMENTAL OPPORTUNITIES 
BEEN APPROXIMATELY THE SAME FOR ALL THE INDIVIDUALS COMPARED? 

According to Colvin, “No intelligence test is a valid measure of 
innate mentality unless it is applied within a group whose members 
have had identical or very similar opportunities for gaining familiarity 
with the materials of the test, and whose members have had not only 
the same opportunities to learn but the same desire to learn.”¹ 

Arlitt made a study of social status. When comparisons were 
made between the races without regard to status the difference between 
the medians of the whites and Negroes was 23.1 points. But when the 
Negroes were compared with the whites of the same social status, the 
difference was 8.6. 

She reports a difference of 33.9 points between the medians of 
children of “inferior” and “very superior” social status of the same race 
and attending the same grades in the same school. The differences 
between the Negro, Italian, and white children studied were not as 
great as those between children of the same race but of different social 
status. “Race norms,” she concludes, “which do not take the social 
status factor into account are apt to be to that extent invalid.”² 

Besides social status, there is the disparity traceable to unequal 
educational exposures furnished Negro and white children. Many 
investigators naively attempt to take cognizance of this fact by 
comparing pupils on the same grade level. Do we thus have comparable 
groups? Most assuredly not. Differences in education are not 
eliminated by comparing a Negro child in the third grade of a poorly 
equipped school, over-crowded classroom, with a poorly prepared, 
underpaid teacher, for an abbreviated school term with a white pupil in 
the third grade of a well equipped school, “standardized” class, with a 
well paid teacher of specialized training, for a nine months term. 
Since both are in the mythical same grade, such a Negro pupil who 
fails to reach the norms established by “similar” white pupils in 
“intelligence tests” is declared mentally inferior. Selected contrast? 

1 Colvin, S. S.: The Present Status of Mental Testing. Educational Review, 
Vol. LXIV, 1922, p. 335. 

2 Arlitt, A. H.: On the Need for Caution in Establishing Race Norms. Journal 
of Applied Psychology, Vol. V, 1921, p. 183. 

Indeed, but not at all uncommon. As a matter of fact, since Washington, 
D. C., and St. Louis, Missouri, are the only cities in the country 
maintaining segregated school systems which are considered function- 
ing on the same standards for white and colored, an unequal educational 
exposure holds in varying degrees in all segregated school systems. 
Long¹ and Bagley² show very pointedly that the greatest disparity in 
test scores is found in sections providing separate school opportunities. 

Even group comparisons of Negroes and whites in the northern 
cities with an appreciable number of Negroes in the mixed schools 
cannot now be considered on their face value. The Negro attendance 
in the North has practically doubled in the past twenty years and this 
increase has come from the South. The pupils in these schools, 
therefore, include a great influx from the South who are retarded 
because of lack of background resulting from the deficient educational 
facilities of the South. 

Washington⁴ shows the striking contrast in the percentage of 
retardation of the Negro pupils from the southern states in the Detroit 
public schools as compared with those from Michigan. From Michigan 
the per cent of retardation was 4.76; from Mississippi 25.0, North 
Carolina 21.63, Georgia 21.3, Virginia 20.0, Tennessee 19.77, South 
Carolina 19.7, and others in like degree. 

Although segregation of Negro children in schools is not recognized 
legally in the northern states, many schools which they attend are 
primarily Negro schools due to the effect of social factors of residential 
districts. With reference to this circumstance, Payne believes that 
“the various factors leading to segregation do not allow the Negro 
to be exposed to the same educational or cultural situations to which 
the whites are exposed in the North. Moreover the special treatment 
is not lost in its effect. It serves to create an attitude of mind in both 
the whites and the Negroes that enforces totally different educational 
effects.”⁵ 


1 Long, H. H.: On Mental Tests and Racial Psychology—A Critique. 
Opportunity, Vol. III, 1925, pp. 134-138; also Race and Mental Tests. Opportunity, 
Vol. I, 1923. 

2 Bagley, W. C.: “Determinism in Education.” Pp. 73ff.; also Army Tests and 
the Pro-nordic Propaganda. Educational Review, Vol. LXVII, 1924, pp. 179-186. 

3 Payne, E. G.: Negroes in the Public Elementary Schools of the North. 
Annals of the American Academy of Political and Social Science, Vol. CXL, 1928, 
pp. 224-233. 

4 Washington, F. B.: “The Negro Student in Detroit.” Survey by Detroit 
Bureau of Government Research, 1926, p. 4. 

5 Payne, E. G.: Op. cit., p. 227. 

It is significant that an increasing number of investigators are 
calling attention to the environmental factors. Davis thinks that 
“there is great doubt if any intelligence test as yet devised is well 
adapted to the Southern Negro . . . When and only when we have 
equalized the character as well as the amount of education possessed 
by the colored and white races can we draw distinct lines between them 
with reference to intelligence.”¹ He sees a definite relation between 
the amount of training and intelligence scores in the high positive 
relationships between the extent of the scores below the standard in the 
Terman test and the variation of from 8 to 18.5 months less in school 
than the standard. 

In a study of abilities as measured by a group test Willard 
concludes, “The presence of growth in considerable quantities for all ages 
and all classes gave evidence of the effect of the environment irrespec- 
tive of native capacity ... The function of the environment in 
determining test scores is important enough to invalidate many 
comparisons that might be made between groups in different 
schools, or groups tested at different times, to determine relative 
mental ability.”² 

Gordon’s experiments led to the finding that, ‘‘It is quite evident 
that, although the mental tests do undoubtedly test some kind of 
ability or abilities, such abilities are not developed without schooling 
or its equivalent, and as a consequence the tests do not evaluate them 
apart from schooling, except perhaps in the case of children under six 
or seven years of age.’’³ 

Although Gordon makes no generalization regarding children 
below six or seven, Woolley makes claims below those ages, ‘‘a certain 
part of what we later call ‘level of intelligence,’ may be due to the 
opportunities to learn given to young children. Very young children 
may show striking differences of intelligence quotient when placed in a 
very superior environment.’’⁴ 





1 Davis, R. A.: Some Relations between Amount of School Training and 
Intelligence Among Negroes. Journal of Educational Psychology, Vol. XIX, 1928, 
p. 127. 

2 Willard, D. W.: Native and Acquired Ability as Measured by the Terman 
Group Test of Mental Ability. School and Society, Vol. XVI, 1922, pp. 750-786. 

3 Gordon, Hugh: “Mental and Scholastic Tests among Retarded Children.” 
London Board of Education Pamphlet, No. 44, 1923, p. 92. 

4 Woolley, Helen T.: The Validity of Standards of Mental Measurement in 
Young Childhood. School and Society, Vol. XXI, 1925, p. 476. 

That difference in school training and social conditions contributes 
to divergence in test results is considered also by Pyle,¹ Sunne,² 
Terry,³ Burkhard,⁴ Gregg,⁵ and others. 

We conclude, therefore, that differences between Negro and white 
children in the North, just like differences between Negro and white 
children in the South, and just like differences between Negro children 
in the North and Negro children in the South, may be explained on the 
ground of differences in environment and education. 


MAJOR CONDITION 2. DOES THE TESTING SET-UP PERMIT VALID 
RACIAL COMPARISONS? 


Several investigators indicate varying situations which operate 
to limit the range of applicability of the interpretation of the results 
obtained therefrom. These we shall now consider. 

A. Comparisons Based on Group Testing.—Valid racial comparisons 
are doubtful if the procedure utilizes group testing and tests of the 
usual type. For race comparisons no worth can be attached to national 
norms for whites. The testing of groups has revealed that even when 
different groups of whites are compared with the same group of Negroes 
the differences vary. 

Peterson states that he has become so skeptical of the results of group 
testing in race psychology that he “is inclined to question all data so 
derived, and to recommend for race comparisons only individual tests 
of a nature and under conditions affording constant stimulation by the 
tester under standardized conditions.”⁶ 

B. Tests Varying in Results—Several tests of widespread use are 
open to the criticism of being undesirable for racial comparisons. 
There is need for caution in interpretation so long as the instruments of 


1 Pyle, W. H.: The Mind of the Negro Child. School and Society, Vol. I, 1915, 
pp. 357-360. 

2 Sunne, D.: Comparison of White and Negro Children in Verbal and Non- 
verbal Tests. School and Society, Vol. XIX, 1924, pp. 469-472; also Comparison 
of White and Negro Children by the Terman and Yerkes-Bridges Revisions of the 
Binet Tests. Journal of Comparative Psychology, Vol. V, 1925, pp. 209-220. 

3 Terry, R. J.: The American Negro. Science, Vol. LXIX, 1929, pp. 337-441. 

4 Burkhard, R.: Blockhead vs. Nordic (Racial IQ’s). Education, Vol. XLVI, 
1926, pp. 494-501. 

5 Gregg, J. E.: The Comparison of Races. Scientific Monthly, Vol. XX, 1925, 
pp. 248-254. 

6 Peterson, J.: Methods of Investigating Comparative Abilities in Races. 
Annals of the American Academy of Political and Social Science, Vol. CXL, 1928, 
p. 179. 













measurement show so much variation and unreliability. Under this 
condition the relative standing of Negro and white children will depend 
upon the incidence of chance in the test selected. 

Sunne raises the issue of the variability of results from different 
tests in the statement, “If these children are compared as to amount of 
their scores according to the different scales, the sex differences between 
the white children according to Point scale ages are greater than race 
differences at chronological ages ten, eleven, twelve, and thirteen, and 
similarly the sex differences of the Negro children at eight, nine, ten, 
eleven, and twelve.’’ Other comparisons are given showing that there 
is “greater variability of racial and sex differences at the different age 
levels, and also that the amount of this variability depends on the scale 
used.”¹ 

Then too, there is the further point of the need of basing compari- 
sons on the results of the same measures. That there is variation in 
intelligence quotients depending upon the test used seems not to occur 
to some individuals who will compare the IQ’s of Negroes obtained 
from one test with IQ’s of whites from another test. 

Miller² reports that the mean intelligence quotient of a group of 
pupils varied from 117.5 to 139.0 in ten different tests which he gave. 
Kefauver found that ‘‘an intelligence quotient of fifty-two obtained 
from one test has the same meaning as eighty-seven on another, and the 
intelligence quotients one hundred thirty-six and one hundred sixty- 
nine have the same meaning on two different tests.’’³ Erroneously 
believing that the intelligence quotient is a constant measure which has 
the same meaning for all mental tests, many individuals would have 
readily declared a Negro child who had obtained the IQ of fifty-two on 
the first test undoubtedly mentally inferior to a white child who had 
obtained the IQ of eighty-seven on the second test. 

Bishop experimenting with the Otis Group Intelligence Scale, Form 
A, reports, “The results of the study seem to indicate that the scores 
on this group intelligence scale are very greatly influenced by the 
teaching the pupils have received. In every group and in almost 





1 Sunne, D.: A Comparative Study of White and Negro Children. Journal of 
Applied Psychology, Vol. I, 1917, p. 73. 

2 Miller, W. S.: The Variation and Significance of Intelligence Quotients 
Obtained from Group Tests. Journal of Educational Psychology, Vol. XV, 1924, 
p. 360. 

3 Kefauver, G. N.: Need of Equating Intelligence Quotients Obtained from 
Group Tests. Journal of Educational Research, Vol. XIX, 1929, p. 93. 

every individual case the evidence is clear. The fact that a pupil can 
double his score on an intelligence test as a result of a few lessons upon 
similar material—not identical material—makes the teaching factor 
seem particularly important. Something other than native intelligence 
seems certainly to be in evidence.”¹ 

Lacy² finds that with the Binet test the colored children show slower 
mental growth than the whites, but that the findings on the Otis 
self-administering tests do not substantiate that idea. Davis³ finds 
that scores on the Terman Group test are influenced significantly by the 
length of time in school. Peterson,⁴ for an entirely distinctive reason, 
does not think that either the Binet test or many other individual tests 
used are satisfactory for racial comparisons. 

C. Reading Deficiency a Negating Factor.—Innumerable studies 
have shown the handicaps of language on the part of children from 
non-English speaking homes. Consequently psychologists have been 
cognizant of this circumstance in interpreting the scores of children of 
foreign descent. Reading inabilities are of similar handicap and render 
insignificant corresponding test scores. To the extent that the tests 
used as an instrument of measure involve reading abilities, to the same 
degree are they useless for racial comparisons of innate mental ability. 

The consideration of the positive correlation between reading 
and intelligence may be of value in individual diagnosis, but for racial 
comparisons is irrelevant since reading is a part of that learned 
circumstance which is not comparable for white and colored children. 

Negro children tested by Witty and Decker evidenced poorest 
scholarship in reading and language usage. One could hardly attribute 
this deficiency to lack of intelligence in view of the fact that these 
“children approach the standards for white children more nearly in the 
history and literature tests than in any of the other tests. It seems 
reasonable that mental ability is a potent prerequisite for success in 
these subjects.”⁵ The authors consider reading deficiency as one of the 
probable operating factors which should cause us to turn to sources 





1 Bishop, O.: What Is Measured by Intelligence Tests. Journal of Educational 
Research, Vol. IX, 1924, p. 34. 

2 Lacy, L. D.: Relative Intelligence of White and Colored Children. Elemen- 
tary School Journal, Vol. XXVI, 1926, pp. 542-546. 

3 Davis, R. A.: Op. cit., p. 127. 

4 Peterson, J.: Op. cit., p. 179. 

5 Witty, P. A. and A. I. Decker: Comparative Study of Educational Attain- 
ment of Negro and White Children. Journal of Educational Psychology, Vol. 
XVIII, 1927, p. 500. 

other than mental inferiority for the explanation of the educational 
status of the Negro. 

A very definite study of this problem was made by Thompson¹ who 
found that colored children reared in Chicago and white children reared 
in the same section showed practically no difference in comprehension 
in reading; but the colored children reared in the South who were at the 
time receiving some of their school training in Chicago scored lower on 
the average than did the other groups. 

An experiment by White shows that scores in a test of general 
intelligence can be increased by reading drills. His conclusions are 
“(1) The net increases in scores made are due to the educational 
influence of the reading drills, (2) The net increases in scores made are 
all significant, (3) The net increases in scores suggest that tests which 
require reading may be unreliable measures of natural intelligence.”² 

Lacy gave standard tests which showed the colored children to be 
below normal in reading. He believed that the reading deficiency 
accounted in part for the average IQ of those taking the Otis 
Self-administering test being lower than the average IQ of those given the Binet 
test.³ 


MAJOR CONDITION 3. ARE THE DATA PRESENTED AS THE BASIS FOR 
COMPARISON SIGNIFICANT WHEN SUBJECTED TO STATISTICAL 
TREATMENT FOR RELIABILITY AND VARIABILITY? 


Since the issue is one of generalization regarding the mental ability 
of a race, proper sampling, the extent of overlapping of the groups, 
and verification of conclusions are vital points. 

A. Sampling.—If an investigator is concerned chiefly with a 
study of certain conditions pertaining in a particular, limited situation, 
there may probably be no need for random sampling. But in an 
investigation concerned with the comparison of mental ability of races, 
the sampling must be random, fair, and representative. 

It is a common practice for investigators to assume that they have 
comparable samples when they select a group of Negro children of a 





1 Thompson, C. H.: “Study of the Reading Accomplishments of Colored and 
White Children.” Unpublished master’s thesis, Department of Education, 
University of Chicago, 1920. 

2 White, W.: The Influence of Certain Exercises in Silent Reading on Scores 
in the Otis Group Intelligence Tests. Educational Administration and Supervision, 
Vol. IX, 1923. 

3 Lacy, L. D.: Op. cit. 

given grade and white children of that same grade. In a segregated 
school system one cannot get fair samplings at a similar grade level of 
white and Negro children, nor can one get at a given age level white and 
Negro children who have similar grade classifications. 

The significance of social status was pointed out above. It should 
be noted here, therefore, that many investigators, in their sampling, 
seem to have overlooked the fact that comparatively few Negro children 
are represented in some of the professional groups, and have taken 
excessively disproportionate samplings of white children classified in 
the higher groupings. 

Pressey and Teter¹ compare 1022 white children with 187 colored; 
Jordan² compares 1504 white with 247 colored; Witty and Decker³ 
1725 white and 220 colored. Ferguson’s conclusions⁴ regarding mulat- 
toes and blacks among Negroes were based upon comparisons of 2288 
mulattoes with only 155 blacks. If it is contended that these are not 
random samplings, generalized racial comparisons are vicious; if they 
are presented as random samplings, then the probabilities of dispro- 
portionate representation in the various social status levels are obvious 
in view of so few Negroes used. 

Considered with respect to the tendency to generalize on only 
a few cases of Negroes, it was illuminating to note the tabulation of the 
studies made between 1917 and 1925.⁵ There were twenty-five studies 
of the Negro. Of the twenty-three reporting the number of individuals 
tested as the basis for study, the number was less than one hundred in 
thirty per cent of the studies; less than one hundred fifty in forty- 
three per cent; less than two hundred in fifty-seven per cent; and less 
than two hundred fifty in seventy per cent. Yet these are studies 
quoted as the basis for generalizations regarding the intelligence of the 
Negro as a race! 

It is understood that mere numbers are not a guarantee of proper 
sampling. The sample may be adequate in size and yet not be 
representative. But when the question of reliability is raised, the 
significance of a measure of reliability is conditioned upon a sufficiently 
large number of cases. 

1 Pressey, S. L. and G. P. Teter: A Comparison of Colored and White Children 
by Means of a Group Scale of Intelligence. Journal of Applied Psychology, Vol. 
III, 1919, pp. 277-282. 

2 Jordan, A. M.: Notes on Racial Differences. School and Society, Vol. XVI, 
1922, pp. 503-504. 

3 Witty, P. A. and A. I. Decker: Op. cit. 

4 Ferguson, G. O.: The Intelligence of Negroes at Camp Lee, Va. School and 
Society, Vol. IX, 1919, pp. 721-726. 

5 Garth, T. R.: Op. cit. 

In addition, then, to questioning generalized racial comparisons 
when the number of Negroes chosen is so small as to preclude even the 
possibilities of proportionate representation, there is the question 
arising from the failure of investigators to show treatment of data for 
indications of the statistical error of small sampling. This, of course, is 
not a check of the randomness of the sampling. But it is related to the 
reliability of results reported from such few cases. ‘‘No summary 
statistical measure computed from a sample,’’ writes Chaddock, 
“should be stated without defining, if possible, its probable variation 
due to the accidental conditions of the sampling procedure.”¹ 
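
Chaddock's requirement may be made concrete with the probable error of a mean, 0.6745 times the standard deviation divided by the square root of the number of cases. The sketch below is an illustration only and is not drawn from any of the studies cited: the standard deviation of fifteen points is an assumed value, and the four sample sizes are simply the limits named in the tabulation above.

```python
import math

# 0.6745 converts a standard error into a probable error, the deviation
# exceeded by chance half of the time under a normal distribution.
PE_FACTOR = 0.6745


def probable_error_of_mean(sd, n):
    """Probable error of a mean based on n cases with standard deviation sd."""
    if n <= 0:
        raise ValueError("n must be positive")
    return PE_FACTOR * sd / math.sqrt(n)


ASSUMED_SD = 15.0  # hypothetical spread of scores, for illustration only

# Sample-size limits named in the tabulation of studies discussed above.
for n in (100, 150, 200, 250):
    pe = probable_error_of_mean(ASSUMED_SD, n)
    print(f"n = {n:3d}: probable error of the mean = {pe:.2f} points")
```

The probable error shrinks only as the square root of the number of cases, so a study of one hundred individuals leaves a noticeably wider margin of chance variation than one of several hundred.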

B. Variability—Reporting differences between the median or 
mean scores of the groups compared is a common practice. But the 
mere reporting of central tendency is worthless in view of the fact that 
the difference between two groups will depend upon the extent of the 
variability in the performance of the individuals within the two groups. 
Consequently a measure of variability is indispensable. In addition 
to this measure of variability there should be some indications of the 
statistical reliability of the difference between the measures. 
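
The dependence of a difference upon variability and number of cases may be sketched as follows. This is a hedged illustration, not a procedure taken from the studies reviewed: it assumes two independent samples, uses the usual formula in which the probable error of a difference is the square root of the sum of the squared probable errors of the two means, and every mean, standard deviation, and sample size shown is hypothetical.

```python
import math

PE_FACTOR = 0.6745  # converts a standard error into a probable error


def pe_of_mean(sd, n):
    """Probable error of a mean from n cases with standard deviation sd."""
    return PE_FACTOR * sd / math.sqrt(n)


def pe_of_difference(sd_a, n_a, sd_b, n_b):
    """Probable error of the difference between two independent means."""
    return math.sqrt(pe_of_mean(sd_a, n_a) ** 2 + pe_of_mean(sd_b, n_b) ** 2)


def ratio_to_pe(mean_a, mean_b, pe_diff):
    """Size of an obtained difference expressed in units of its probable error."""
    return abs(mean_a - mean_b) / pe_diff


# Hypothetical figures: the same five-point difference between two means,
# with the same assumed spread, observed once in large and once in small samples.
MEAN_A, MEAN_B, SD = 100.0, 95.0, 15.0

pe_diff_large = pe_of_difference(SD, 1000, SD, 1000)
pe_diff_small = pe_of_difference(SD, 40, SD, 40)

ratio_large = ratio_to_pe(MEAN_A, MEAN_B, pe_diff_large)
ratio_small = ratio_to_pe(MEAN_A, MEAN_B, pe_diff_small)

print(f"1000 cases per group: difference is {ratio_large:.1f} times its PE")
print(f"  40 cases per group: difference is {ratio_small:.1f} times its PE")
```

An identical difference between central tendencies is thus far less dependable when the groups are small or the scores widely scattered, which is why the difference should be reported together with its probable error.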

In his advocacy of the use of a common unit in the measurement of 
race differences, Peterson writes, ‘‘The practice by certain writers of 
reading the percentage of the Negro median of the white median as per- 
centage of ability is regarded as unfortunate. So-called qualitative 
differences were found to be in many cases unreliable or probably so; it 
is suggested that hereafter such differences be stated in quantitative 
terms, not merely in ‘better than’ or ‘inferior to’ terms. Comparisons 
of Negro scores with those of whites of different sections and different 
school systems show that certain ‘qualitative differences’ are not 
constant.’’² 

C. Verification—An experimental study submitted for scientific 
acceptance must be capable of verification. Just as scientific research 
requires a writer to indicate the source of his evidence or reference, it 
is also necessary for the investigator to report sufficient of the 
primary data as to permit verification of his statistical treatment and 
interpretation. 





1 Chaddock, R. E.: “Principles and Methods of Statistics.” P. 246. 
2 Peterson, J.: The Use of a Common Unit in the Measurement of Race Differ- 
ences. Psychological Bulletin, Vol. XX, 1923, pp. 424-425. 

With reference to the reporting of data, Peterson believes that “In 
all measurements it is essential to give the primary data in some 
form . . . IQ’s and other mental coefficients vary so much for different 
tests as to have little value for direct comparisons.” Further, “not 
only should differences be compared with their PE to ascertain their 
reliability, but their size should also be given. When this is done, 
many so-called ‘qualitative differences’ are found to be mere chance 
affairs.”¹ 

From the analysis of these three major conditions basic for valid 
interpretations, a check list seems desirable which one may use as a 
criterion for accepting an experimental study as valid for general- 
izations regarding comparative differences in mental ability of races. 
The following check list in the form of questions requiring affirmative 
answers is proposed: 

1. Have the individuals tested had identical or very similar oppor- 
tunities for gaining familiarity with the materials of the test which are 
assumed to be common? 

2. Were the provisions for formal education of the groups compared 
functioning on the same standard? 

3. Has the investigator checked the results of his measure of mental 
ability to ascertain their probable correlation with the results of some 
scientific measure of social status? 

4. Does the investigator base his comparison on differences more 
valid than the norms in the commonly used tests which reflect national 
standards for whites with educational conditions different from the 
Negroes tested? 

5. Are the scores or intelligence quotients used in the comparison 
the results obtained from the same test? 

6. If the study utilizes quotients obtained from different tests, 
has the investigator statistically equated the quotients? 

7. Is the test used one of unquestionable validity as a measure of 
mental ability? 

8. Is the test reliable? 

9. Has the investigator utilized a sufficiently large number of 
cases as to practically assure including a fair representation in dis- 
tribution of abilities, if the groups offer such possibilities? 

10. Has the investigator chosen enough cases as to eliminate 
differences of marked disproportionate magnitude between the number 
of cases in the samples compared? 





1 Peterson, J.: Methods of Investigating Comparative Abilities in Races. 
Op. cit., p. 183. 










11. Does the investigator indicate the probable variation in results 
due to the accidental conditions of sampling? 

12. Has the investigator checked the reliability of an obtained 
difference with respect to the formulae for calculating the standard 
deviation or probable error of the difference between two measures? 

13. Was the sampling random? 

14. Does the investigator report the extent of variability as well as 
the central tendency of his data? 

15. Does the investigator report sufficient of the primary data in 
some form as to permit verification of the statistical treatment and 
interpretation of his data? 

16. Does the investigator report the conditions under which the 
tests were given in sufficient detail to permit one to know whether the 
conditions were standard, and were comparable for the different groups 
tested? 

In the light of these criteria we may conclude that (1) most studies 
so far reported are worthless as indicating anything regarding the 
comparative mental ability of races; (2) most of our present tech- 
niques give measures of differences due to weaknesses in educational 
opportunities rather than of differences in mental ability; (3) there 
is need of a re-evaluation of the problems and methods of studies per- 
taining to racial differences. 








AN EXPERIMENT ON TYPES OF MEMORY ABILITY 


DOROTHEA JOHANNSEN, MARGARET STIRLING, AND 
JANICE LEVINE 


Skidmore College 


The problem of the present study was twofold, the second part 
being subordinate to the first: (1) To determine whether memory 
ability is specific for the type of material learned, or quite general and 
independent of the type of material. (2) To determine whether, 
should the evidence point to specific types of memory ability, they 
depend upon native factors or result from training. If such special 
abilities should result from training, then an S would tend to do better 
in the field of his intellectual interest than in other fields. Therefore, 
two types of memory material—“literary” and “scientific”—were 
chosen, and two corresponding types of Ss—one group whose major 
field was scientific and another group whose major field was literary— 
were used. Assuming special types of memory ability to exist, a 
positive correlation found between such abilities and fields of intel- 
lectual interest would suggest, but would not prove, that the special 
abilities are acquired; the lack of such a relationship, however, would be 
fairly definite evidence that such abilities are not acquired. 

Twenty-eight members of the faculty and twenty-nine students of 
Wellesley College served as Ss. Both faculty and student groups 
were divided into “literary” and “scientific” types. The students 
were classified on the basis of the subject in which they were majoring, 
and the members of the faculty on the basis of the subject which they 
were teaching. Those Ss whose fields were on the border-line of 
science, such as history, political economy, and economics, were classi- 
fied in accordance with their minor subjects. 

Although this classification is too gross to indicate small differences, 
it was hoped that if a marked difference existed, it would appear under 
such a division. With the number of Ss used, finer division would 
have resulted in groups too small to handle statistically. 

Conventional “literary” and “scientific” passages were chosen for 
learning material. In so far as possible, it was desired to have the 
material of equal difficulty for all Ss. The two “literary” passages 
were from Ruskin’s Modern Painters; they were chosen for their 
purely sensory description and almost complete lack of logical thought 
other than the temporal sequence. The two “scientific” passages were 
selections from Sellar’s Essentials of Logic. These carry an idea 
through with the clear reasoning necessary in all sciences without 
being partial to the subject-matter of any one. In selecting the 
material it was deemed best that the two passages of the same type be 
by the same author, so that no difference in style would introduce an 
unmeasurable error. The two passages of each pair were quite com- 
parable in difficulty, so far as this might be determined subjectively. 

The “Aussagemethode” was used. This method was chosen as the 
one best adapted to meaningful material of considerable difficulty, as it 
is uniform for all subjects, and is far less time-consuming than any 
method of complete mastery. The slight subjectivity in the scoring of 
the responses is a disadvantage in this method, but it was felt that the 
advantages it possesses for this type of material outweighed the 
disadvantages. 

The four passages were learned on four different days. Forty- 
eight hours were permitted to elapse between the first and second 
appointments, and between the third and fourth. At the first and 
second meetings the two selections of one kind of material were pre- 
sented, and at the third and fourth the two selections of the other type. 
The time between the second and third appointments was not strictly 
constant, and usually exceeded forty-eight hours in length. 

S was given the selected passage (type-written) and instructed to 
read it aloud, with intent to remember. She read it three times, and 
was then asked to write immediately as much as she could recall. 
There was no time limit for either reading or writing, but in general 
the whole procedure required from thirty-five to forty minutes, for 
each of the four passages. The method used in remembering the 
passage, whether S liked or disliked doing it, and whether she felt at all 
panicky at the problem set, was recorded at each appointment. 

Each of the four parts of the experiment was presented first in 
approximately the same number of cases. Those Ss classified as 
“scientific” were given one of the “scientific” selections first, however, 
and the “literary” Ss were given a “literary” passage first, in an effort 
to set up a favorable attitude toward the problem. 

The score in each case was the number of ideas correctly recalled. 
“Correctly” was interpreted to mean, stated in a form which conveyed 
the same meaning, although the wording might differ from the original. 
Statements not contained in the original were ignored. 

All of the tests were given and scored by the two junior authors, 
each doing approximately half of the tests. All the tests for one S 
were done by the same E. 


RESULTS 


1. Specific vs. General Memory Ability.—If the ability of individuals 
for remembering meaningful material is general, then the kind of 
material used in measuring this ability should make no difference in 
relative scores. The correlation obtained between scores on different 
types of material should then be approximately the same as the correla- 
tion obtained between scores on the same type. If, on the other hand, 
memory ability is dependent on the kind of material learned, then the 
correlation between tests on the same kind of material will be higher 
than the correlation between tests on different kinds of material. 

The correlation between the “literary” and the “scientific” scores 
was found to be .5312 ± .095 (sigma) for the whole group (fifty-seven 
cases). Each S took two “scientific” and two “literary” tests; 
therefore, this correlation was computed between the average scores of 
the two tests on each type of material. 

The correlation between the scores on the two “literary” selections 
for the whole group was .8408 ± 0.0256; that between the two 
“scientific” selections was .7516 ± 0.0576. To make these correlations 
comparable to the correlation between the “literary” and “scientific” 
scores, however, it is necessary to correct for the difference in the 
length of the tests. As mentioned above, the two “scientific” and the 
two “literary” scores were averaged to obtain the score used in 
determining the correlation between the two types of material; the 
correlation between the two “literary” selections, and the one between 
the two “scientific” selections, however, were necessarily based on tests 
only half as long. The two latter r’s were therefore corrected by the 
Spearman-Brown formula, with the result that the correlation between 
the two “literary” scores is raised to .9135 ± 0.042 (corrected sigma), 
and that between the two “scientific” scores is raised to .8576 ± 0.0377 
(corrected sigma). 
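
The correction just described can be retraced with a short sketch. It applies the Spearman-Brown formula for a test of doubled length to the two half-length correlations quoted above. The function names are illustrative only, and the standard-error formula, (1 - r^2) / sqrt(N), is an assumption: the text does not say how its sigmas were obtained. Under these assumptions the sketch gives .9135 for the "literary" pair, as printed; about .858 for the "scientific" pair, close to the printed .8576; and .095 for the sigma of the cross-type correlation.

```python
import math


def spearman_brown(r: float, k: float = 2.0) -> float:
    """Estimated correlation when each test is lengthened k-fold."""
    return k * r / (1.0 + (k - 1.0) * r)


def sigma_of_r(r: float, n: int) -> float:
    """Classical large-sample standard error of r.

    Assumed here for illustration; the article does not state its formula.
    """
    return (1.0 - r * r) / math.sqrt(n)


N_CASES = 57  # whole group, as reported in the text

# Half-length correlations quoted in the text, stepped up to double length.
literary_corrected = spearman_brown(0.8408)
scientific_corrected = spearman_brown(0.7516)

# Sigma of the correlation between "literary" and "scientific" averages.
sigma_cross = sigma_of_r(0.5312, N_CASES)

print(f"literary pair, corrected:   {literary_corrected:.4f}")
print(f"scientific pair, corrected: {scientific_corrected:.4f}")
print(f"sigma of the cross-type r:  {sigma_cross:.3f}")
```

The same assumed formula also returns 0.0576 for r = .7516 with fifty-seven cases, which agrees with the sigma printed for the uncorrected "scientific" correlation.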

The reliabilities of the differences between these r’s are given in 
Table I. 

These figures show that both differences are practically certainly 
reliable, the chances in both cases being 9999.99 out of 10,000 that the 
difference will remain above 0. 

We may therefore conclude that memory ability is not a general 
factor, but is more or less specific for the type of material learned. 

2. Memory and Intellectual Interest.—Since we have found memory 
ability to be a function of the type of material learned, it appears 
reasonable to suppose that those persons who are trained in the 
“scientific” field would do better, on the whole, on the “scientific” 
selections than those persons who are trained in the “literary” field, 
and vice versa. 

TABLE I (only one row is legible): “Literary” and “Literary,” r = .9135, sigma = 0.042 

The results of this comparison, together with the differences and 
their reliabilities, are summarized in Table II. 

Analysis of these figures shows that on the “literary” test the 
performance of the “scientific” group excels that of the “literary” 
group. On the “scientific” test the reverse is the case; the reliability 
of the latter difference indicates its insignificance, however. These 
figures indicate that the training of these Ss does not increase their 
ability to remember the type of material most closely allied with their 
own work, but on the contrary, they make better scores in the field in 
which their major interest does not lie. 

It seemed possible that a positive relationship between the field of 
the intellectual interest and the score on the test might appear if the 
faculty members alone were considered, since the average length of 
training in the preferred subject-matter was much longer for the faculty 
members than for the students. Many college students have no very 
definite idea as to the field in which their chief intellectual interest lies; 
in any case, the college curriculum is so arranged that the student is 
permitted little selection, and the training is consequently very 
generalized. 

The faculty members were classified into “literary” and “scientific” 
groups, and the average performance of these groups on the “scientific” 
and “literary” tests calculated separately. For the purpose of 
comparison the students were similarly divided, and their scores 
computed. The results are given in Table III. 

TABLE III 

                       Faculty                      Students 
Test            “Literary”  “Scientific”     “Literary”  “Scientific” 
“Literary”         37.84        47.96           49.77        53.91 
“Scientific”       59.77        57.16           70.85        66.13 

These figures show that for both faculty members and students, but 
for the former in a much more marked degree, the better score is made 
in the field in which the S’s intellectual interest does not lie. The 
smallness of these subdivided groups makes it impossible to determine 
the reliability of these differences, but the consistency of the result 
makes it appear that we are justified in concluding it to be fairly 
reliable. We offer as a tentative interpretation of the figures, the 
suggestion that the attitude of the Ss determined the result. In 
almost every case the S voluntarily said that the subject-matter of the 
tests which did not lie in his own field was the more interesting. 
“Literary” Ss complained that the Ruskin passages were “Ruskin at 
his Ruskinest,” while at the same time reporting that the “scientific” 
passages were “very interesting.” “Scientific” Ss, on the contrary, 
found the logic passages “dull” but remarked on the “beauty of 
Ruskin’s descriptions.” The explanation of such a difference in 
“set” may be merely the greater interest attaching to the more 
unfamiliar, or it may lie in the greater critical appreciation of the 
literature of one’s own field. 
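
The pattern read out of Table III can be checked with a small sketch. It takes the eight means of the table and, for each group and each test, subtracts the mean of the Ss whose own field matches the test from the mean of the Ss whose field does not. The dictionary layout and the function name are illustrative choices and are not part of the original study. A positive difference means that the Ss from the other field made the better score.

```python
# Mean scores from Table III, indexed as TABLE_III[group][test][field of the Ss].
TABLE_III = {
    "faculty": {
        "literary": {"literary": 37.84, "scientific": 47.96},
        "scientific": {"literary": 59.77, "scientific": 57.16},
    },
    "students": {
        "literary": {"literary": 49.77, "scientific": 53.91},
        "scientific": {"literary": 70.85, "scientific": 66.13},
    },
}


def outsider_advantage(group: str, test: str) -> float:
    """Mean of Ss from the other field minus mean of Ss from the test's own field."""
    other_field = "scientific" if test == "literary" else "literary"
    row = TABLE_III[group][test]
    return round(row[other_field] - row[test], 2)


for group in TABLE_III:
    for test in ("literary", "scientific"):
        print(group, test, outsider_advantage(group, test))
```

All four differences come out positive, which is the consistency referred to in the text; the largest is the faculty difference on the "literary" test.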

3. Minor Results.—When the average scores of the faculty members 
are compared with those of the student group, the latter is found to 
excel on both the “literary” and “scientific” selections. The results of 
this comparison are given in Table IV. 

These figures indicate that regardless of the material used for the 
test the students excel the faculty members. Several possible 
interpretations suggest themselves. In the first place, these data tend 
to substantiate the suggestion of the more generalized training of the 
students, which results in their obtaining higher scores in both sections 
of the test. Another possible interpretation is the difference in ages 
which is found in the two groups. The correlation between the 
students’ ages and scores was 0, but the age-range here was too small to 
be effective. In the faculty group alone the rho between the ages and 
scores was −.134, i.e., there is a very slight tendency for the older 
members of the group to get better scores, a result which is rather 
contradicted in the better performance of the students. In view of the 
unreliability of the rank-difference method of obtaining correlations 
in general, and the small size of this one in particular, we are probably 
justified in concluding that age and memory ability do not show a 
definite correlation. A third explanation of this difference in perfor- 
mance lies in the fact that students are daily required to reproduce from 
memory material not so very different from that which was presented 
them in this experiment, and after a method of learning which does not 
differ to any great degree from the method used here. Finally, there is 
considerable difference in the relative motivation of the two groups; 
this is the factor that appears to us most significant. Students have a 
well-developed competitive spirit, and the present experiment was 
looked upon as a kind of examination, in which every S desired to excel 
the other Ss, particularly the faculty! Such motivation was lacking in 
the faculty group and nothing so compelling substituted for it. 


SUMMARY AND CONCLUSIONS 


1. Under the conditions of this experiment we find the score made 
on a memory test to be a function of the kind of material learned. 

2. The fields of intellectual interest of college faculty and students 
do not, apparently, determine the kind of material most easily 
remembered. Therefore, the most probable inference is that the 
difference in ability here found is innate rather than acquired. 

3. The performance of the students was superior to that of the 
faculty in both sections of the test. The more generalized training and 
more compelling motivation of the students are suggested as possible 
explanations of this difference. 


A LABORATORY STUDY OF THE READING OF 
FAMILIAR NUMERALS 


G. NEVIN REBERT 


Hood College 


THE PROBLEM 


The purpose of this study is to determine how familiar numerals 
are read when they appear in context. The investigation is concerned 
with the differences, if any, which exist between the eye-movements 
made during the reading of words in context and those made during 
the reading of familiar numerals in context. 

A previous investigation¹ has shown that numerals are read in a 
manner characteristically different from that in which words are read. 
In the case of numerals, the digits appear in constantly changing 
combinations; the letters in words appear repeatedly in the same 
combinations. In consequence, because the reader cannot anticipate 
the sequence of the digits as he can the sequence of letters in words, he 
tends to make more fixations, more regressions, and fixations of longer 
duration in reading numerals than in reading words. 

The present investigation is concerned primarily with the reading of 
familiar numerals. Certain numerals occur so frequently in textual 
material that it seemed desirable to determine what modifications, if 
any, occur in the eye-movements of those who have occasion frequently 
to read them. 


PROCEDURE 


Subjects.—A total of one hundred six subjects was used in this 
investigation. Forty-six of these subjects, comprising two class 
groups, were second year high school pupils who participated in a 
preliminary investigation of the reading of familiar dates. The 
remaining sixty subjects who participated in the laboratory 
investigations were high school and university students, and university 
instructors. Eleven of these subjects who read a geometry selection were 
advanced university students in mathematics. 

1 Terry, Paul Washington: How Numerals Are Read: An Experimental Study 
of the Reading of Isolated Numerals and Numerals in Arithmetic Problems. 
Supplementary Educational Monograph, No. 18, Chicago: Department of 
Education, University of Chicago, 1922. 

The Reading Selections.—Two reading selections were used in the 
investigation. One was taken from the subject of American history 
and the other from geometry. The history selection contained five 
dates presumably familiar to high school pupils. In order to deter- 
mine the amount of detail with which these familiar dates were read, 
the date of the Civil War was given erroneously as 1868. The familiar 
3.1416 appears three times in the geometry selection. 

The selections, which are shown in Plates I-IV below, were printed 
in lines four inches in length. Twelve point type was used. The 
spacing between the lines was approximately one-eighth of an inch. 
The history selection will be designated hereafter as Selection D and 
the geometry selection as Selection G. 

Techniques of the Investigation.—Forty-six second year high school 
pupils read Selection D as a group silent reading test under customary 
classroom conditions. After these pupils had completed a single rapid 
reading of the selection, they were given a mimeographed copy which 
differed from the original in that it contained blank spaces instead of 
the dates. They were instructed to write in the blank spaces the dates 
which they recalled. 

The remaining sixty subjects read Selection D before the eye-move- 
ment camera. Eleven of these subjects read Selection G under the 
same conditions. 

The apparatus used in the investigation was the eye-movement 
camera in the laboratory of the School of Education at the University 
of Chicago. The location, duration, and sequence of fixations made 
upon the reading selections were determined from the film records by 
the method that has been standardized in previous investigations. 
In interpreting the records of the reading of the numerals in this 
investigation, however, a fixation falling upon a numeral is defined as 
one located upon the numeral or within the space on either side of the 
numeral. Previous investigation¹ has shown that the “point” of 
fixation is in reality an area approximately three or four letter spaces in 
width, in any point of which the center of foveal vision may be located. 
It is assumed, therefore, that if the “point” of fixation is located 
outside of the space on each side of the numeral, the chances are greater 
that the fixation did not actually fall upon the numeral than that it 
did. 

1 Ruediger, Wm. C.: The Field of Distinct Vision. Columbia University 
Contributions to Philosophy, Psychology, and Education, Vol. XVI, No. 1, 
New York: Science Press, 1907. 

The same procedure was followed by the investigator in connection 
with the reading of every subject who participated in the laboratory 
investigation. After each subject had been placed at the apparatus, he 
read a practice card twice under the same conditions under which the 
experimental selections were read. The verbal instructions in each 
case were as follows: “The history (geometry) selection you are about 
to read contains a number of statements with which you are familiar. 
Do not study it. Read as rapidly as you can—consistent with 
understanding.” After the reading of each selection the following questions 
were asked: Selection D, “What was the date of the Civil War as you 
recall reading it?” Selection G, “What numeral and formula have 
you just read?” Each subject was also asked to record his 
introspections. 


RESULTS 


Accuracy with Which Dates Are Read.—Twenty-four of the forty- 
six high school pupils who read Selection D as a group test reproduced 
correctly the erroneous date 1868. In addition two subjects filled in 
1868 in the spaces for the dates of the Spanish-American and World 
Wars, respectively. Also, four other subjects wrote the date 1886 
instead of 1868. The familiar dates 1776, 1812 and 1917 were incor- 
rectly reproduced by less than ten per cent of the subjects; 1898, by 
thirty-five per cent. Familiarity with the dates may have operated 
either to facilitate correct recall, or to cause the subjects to substitute 
the dates from the context rather than from the memory of actual 
perception. On the basis of the reading of 1868, however, it seems that 
at least fifty per cent of the subjects actually perceived the dates in 
Selection D. 

Twenty-three of the sixty subjects (38 per cent) who read Selection 
D under laboratory conditions read 1868 correctly. Although 
this is a somewhat lower percentage than in the case of the high school 
pupils, the conditions were somewhat different in the two cases. 
The higher degree of accuracy on the part of the second year high 
school pupils may have been due to the fact that children are likely to 
pay more attention to details than adults. Introspections of a 
number of the adults in the laboratory group indicate that they 
neglected accuracy of detail for the general ideas expressed in the 
context. 
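
The two proportions compared in the preceding paragraphs follow directly from the counts reported. The sketch below merely restates that arithmetic; the helper name is illustrative.

```python
def per_cent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, expressed in per cent."""
    return 100.0 * part / whole


# Subjects who reproduced the erroneous date 1868 as printed.
classroom = per_cent(24, 46)   # high school pupils, group test
laboratory = per_cent(23, 60)  # subjects reading before the camera

print(f"classroom group:  {classroom:.1f} per cent")
print(f"laboratory group: {laboratory:.1f} per cent")
```

The classroom figure falls just above the "at least fifty per cent" mentioned in the text, and the laboratory figure rounds to the 38 per cent given there.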

Eye-movements in Reading Familiar Numerals.—A summary of the 
records of the reading of Selections D and G is shown in Table I. 
Typical reading records of four subjects are shown in Plates I-IV.¹ It 
may be observed in Table I that there is a tendency for the subjects to 
use fixations of slightly longer duration to read the numerals than to 
read the entire context. Seventy-five per cent of the subjects used 
fixations of as long or longer duration to read the dates than to read the 
words. All of the subjects used fixations of longer duration to read 
3.1416 than to read the words. The relatively large mean number of 
fixations per line (7.8), together with some of the introspective evidence, 
seems to indicate that the laboratory subjects tended to read the 
selections critically rather than in a cursory manner. One subject 
said, “I read almost word for word.” Another subject said, “Being 
asked questions concerning these cards caused me to read more 
carefully.” The mean number of regressions per line shown in Table I 
is not excessive in the case of the reading of either selection when it is 
taken into consideration that the reading attitude was in large measure 
analytical in character. In the following paragraphs these tendencies 
in respect to the number and duration of fixations and the number of 
regressive fixations are discussed. 

1 The numbers above the vertical lines indicate the order of fixations; those 
below the lines give the duration of the fixations in units of twenty-fifths of a 
second. 

TABLE I.—A SUMMARY OF THE EYE-MOVEMENT RECORDS OF SIXTY SUBJECTS IN 
THE READING OF SELECTION D AND OF ELEVEN SUBJECTS IN THE READING 
OF SELECTION G 

             Mean duration of      Number of           Number of 
Reading      fixations¹ per        fixations per       regressions per 
selection    Line     Numeral      Line     Numeral    Line     Numeral 
D            6.3      7.3          7.8      1.2        1.3      .09 
G            5.8      7.5          7.8      1.5        1.1 

1 Time in units of one twenty-fifth second. 

The Number of Fixations Made upon the Numerals.—In Table II 
are given the distributions of the number of subjects who read a given 
numeral with a given number of fixations. Two significant facts are 
revealed by this table. In the first place it indicates that familiar 
numerals tend to be read like words. In thirteen per cent of the cases 
the numerals were not fixated directly. They were fixated once in 
fifty-six per cent of the readings. This makes a total of sixty-nine per 
cent of the cases in which numerals were read, if read, as familiar words 
are read by mature subjects. Reading records of this type are shown 
in Plates I and III. 

TABLE II.—DISTRIBUTION OF THE SUBJECTS WHO READ SELECTIONS D AND G 
ACCORDING TO THE NUMBER OF FIXATIONS MADE UPON CERTAIN NUMERALS 

1 Regressive fixations involved. 

In the second place, Table II shows that there is a tendency to read 
the numerals in detail. More than one fixation was made upon a 
numeral in thirty-one per cent of all the readings of numerals. Records 
of such detailed readings are shown in Plates II and IV. 

Number of Fixations Made upon Numerals and Accuracy of Reading. 
Sixty-five per cent of the subjects who read the erroneous date, 
1868, correctly made one fixation upon it or did not fixate it directly. 
Sixty-two per cent of the subjects who read it inaccurately made one 
fixation upon it. Approximately the same percentage of each of these 
groups made two or more fixations upon 1868. It seems clear, 
therefore, that the number of fixations made upon the dates bore little, if 
any, relationship to the accuracy of the reading. It is evident also 
that it is possible to read a date accurately during one fixation when it 
appears in context. 

PLATE I.—Record of the reading of familiar dates like words in context. 


Regressive Reading of Familiar Numerals.—Twenty-one of the sixty 
subjects who read Selection D made one or more regressive fixations 
caused by dates. The largest number of regressions was caused by the 
date, 1868. In all probability the error in the date was responsible for 
this fact. Six of the nine subjects who made regressive fixations upon 
1868 read it accurately. The records of three of the six show definite 
areas of confusion centering around 1868. In her introspection, one 
subject specifically called attention to the fact that she was confused 
by the error in the date. 

PLATE II.—Record of the reading of familiar dates in detail. 


Only one regression terminated on the date, 1812. This may have 
been merely a chance happening. On the other hand, this date is part 
of the familiar expression, War of 1812. In this sense it is probably the 
most familiar of the five dates. 

PLATE III.—Record of the reading of 3.1416 like words in context. 

PLATE IV.—Record of the reading of 3.1416 in detail. 

PLATE V.—Four types of regressive readings of the familiar numeral, 3.1416. 
Horizontal lines represent path of eye-movement. Vertical lines projected 
upward locate points of fixation. Arrow indicates direction of eye-movement. 

Fifteen of the twenty-one subjects who made regressive fixations on 
the dates read only one date of the five with a regressive fixation. Five 
subjects read two of the five dates regressively. One subject read 
three of the five dates regressively. The total number of regressions 
made by the twenty-one subjects is twenty-eight. A total of one 
hundred five dates was read. Thus in 74.3 per cent of the cases no 
regressive readings were made by the twenty-one subjects. The 
entire group of sixty subjects made a total of three hundred readings of 
dates. More than ninety per cent of these readings were made without 
regressions. If the regressions caused by the erroneous date, 1868, are 
subtracted, the percentage will be still larger. 
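
The tallies of regressive readings of dates reported above can be retraced with a brief sketch. It assumes that each date read regressively counts as one regressive reading, an assumption made only for this illustration, and the variable names are likewise illustrative. On that assumption the counts reproduce the totals of twenty-eight regressive readings and one hundred five dates. They give 73.3 per cent of readings free of regression for the twenty-one subjects, against the 74.3 per cent printed, and 90.7 per cent for the whole group of sixty.

```python
DATES_PER_SELECTION = 5

# Number of subjects who read exactly k of the five dates regressively.
subjects_by_dates_regressed = {1: 15, 2: 5, 3: 1}

subjects = sum(subjects_by_dates_regressed.values())
# Assumption: one regressive reading per date read regressively.
regressive_readings = sum(k * n for k, n in subjects_by_dates_regressed.items())

readings_subgroup = subjects * DATES_PER_SELECTION
readings_all = 60 * DATES_PER_SELECTION

clean_subgroup = 100.0 * (readings_subgroup - regressive_readings) / readings_subgroup
clean_all = 100.0 * (readings_all - regressive_readings) / readings_all

print(subjects, regressive_readings, readings_subgroup, readings_all)
print(f"without regression, twenty-one subjects: {clean_subgroup:.1f} per cent")
print(f"without regression, all sixty subjects:  {clean_all:.1f} per cent")
```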

There are four types of regressions that occurred in connection 
with the reading of 3.1416 in the context of Selection G. Examples of 
these types are shown in Plate V. All four of these types are illustra- 
tive of the fact that a change in reading attitude may be caused by the 
presence of this familiar numeral in context. 

The difficulty that is encountered in readings of Types I to III 
seems to be concerned with fitting 3.1416 into the context rather than 
with the perception of 3.1416. In none of these types were two suc- 
cessive fixations made upon the numeral. The regressive fixations 
invariably were related both to the words and the numeral. On the 
other hand, the regressions of Type IV were confined entirely to 3.1416. 
This seems to warrant the assertion that this type of regression was 
caused by difficulty in perception. 

Types I and II both involve the element of rereading. In Type I 
the rereading seems to have been made with the purpose of relating 
3.1416 to a preceding or a following word. The regressive reading of 
Type II seems to have been made with the purpose of fitting the 
expression, 14 of 3.1416, into the thought of the context. 

Type III, as represented by the record of reading of Subject N, 
Plate V, probably is illustrative of central confusion of one kind or 
other. Subject H made three attempts to begin the reading of the line 
that starts with the expression, by 14 of 3.1416. His third attempt was 
successful. As in Type II the difficulty centered around the phrase. 
In Type II, however, there seem to have been two orderly readings, 
while in Type III one orderly reading seems to have been preceded by a 
period of confusion. 

It seems evident on the basis of the preceding discussions that it is 
possible to read the familiar numerals in the context of Selections D and 
G in a manner approaching that in which words are read in context. 
The records of the reading of these numerals by a number of subjects 
reveal, however, that detailed readings with two or more fixations are 
likely to occur. This tendency to detailed reading, together with the 
difficulties revealed through a study of the regressive readings, warrants 
the conclusion that the reading of numerals does not quite become the 
same as the reading of words, even under the most favorable conditions. 


CONCLUSIONS 


1. The familiar numerals used in the present investigation tend to 
be read in the same manner as the words in the passages in which they 
appear. 

2. In a number of cases detailed readings involving two or more 
fixations occur. 

3. There are two methods of reading familiar dates in context. In 
some cases the dates are neglected for the general idea; in others they 
are read with accuracy. 

4. There seems to be no relationship between the number of 
fixations and the degree of accuracy with which the familiar dates were 
read. 

5. There is a pronounced tendency to read the familiar numerals 
with one fixation. 

6. A minority of subjects read the numerals in detail with more 
than one fixation. 

7. Excepting in the case of the erroneous date, 1868, there is no 
conclusive evidence that dates in context cause an increase in the 
number of regressions made by mature subjects. 

8. An error in the date of the Civil War was responsible for the 
appearance of an area of confusion in the records of a number of 
subjects. This confusion was indicated by a number of regressive 
fixations. 

9. The presence of 3.1416 in the context seems to cause regressive 
fixations and rereadings in a number of cases. These regressions seem 
to indicate difficulty (a) in associating 3.1416 with a word in the con- 
text, (b) in associating a phrase in which 3.1416 appears with the other 
parts of the context, (c) in reading 3.1416, or (d) to indicate a period 
of detailed analysis probably caused by the presence of 3.1416 in the 
context. 

10. There is a tendency to make fixations of longer duration upon 
the numerals than upon the words. Whether this is on account of the 
significance of the dates to the context or on account of perceptual 
difficulty, does not appear. 
THE PERSISTENCE OF LEARNING IN ELEMENTARY
ALGEBRA¹


EDNA THOMPSON LAYTON 


Senior High School, Baldwin, N. Y. 


The average teacher of Plane Geometry assumes that her pupils are 
well grounded in Elementary Algebra and can, therefore, work without 
difficulty any problem involving algebraic procedure. The necessity 
of explaining, from time to time, the various algebraic processes needed 
in the solution of numerical originals in Plane Geometry gives rise to 
the question of just how much knowledge of Elementary Algebra 
actually is retained by the average pupil. 

Eikenberry² and Thorndike³ made earlier studies to test retention
but they assumed that the content the retention of which they were
measuring had at one time been known. Worcester⁴ made a similar
study in the field of Elementary Algebra, first determining what was 
known, but he fails to state whether or not the pupils he tested were 
studying mathematics during the time between the tests. It seems 
probable from what he does say that they were studying Elementary 
Algebra at least a part of that time. Worcester based his conclusions 
on a study of twenty-two cases and their retention over a period of from 
six to eight months. Therefore none of these studies has determined 
accurately the amount of previously learned algebraic knowledge 
actually retained. 

In order to do this and to solve some related problems the study 
herein reported was undertaken in the Milne High School, the practice 
school for New York State College for Teachers. This study was made 
of the results of fifty-one cases and the retention period was one year 
during which the pupils were receiving no instruction in any kind of 
mathematics. Stated severally the purposes of the study were: (1) 
To find the amount of knowledge of Elementary Algebra retained over 
a period of one year during which no mathematics of any kind was
studied; (2) to compare the retention of boys with that of girls; (3)
to determine whether there are certain facts or skills that are retained
better than others; (4) to find the amount of knowledge gained in a
month’s intensive review and to determine the relation of the amount
of knowledge gained in the month’s intensive review to the amount lost
in the next eleven months; (5) to determine the relation of the amount
of ability retained to the intelligence of the pupils; (6) to determine the
relation of intelligence to the retention of manipulative technique and
to ability to solve verbal problems.

1 The data presented here are taken from a Master’s Thesis written at New
York State College for Teachers.

2 Eikenberry, D. H.: Permanence of High School Learning. Journal of
Educational Psychology, Vol. XIV, 1923, pp. 463-481.

3 Thorndike, E. L.: “The Psychology of Algebra.” New York: Macmillan,
1924, pp. 452-457.

4 Worcester, D. A.: Permanence of Learning in High School Subjects; Algebra.
Journal of Educational Psychology, Vol. XIX, May, 1928, pp. 343-345.

The experimental group for this study was composed of fifty-one
ninth-year pupils, of whom thirty-nine were girls and twelve were boys.
These pupils had been roughly grouped as to ability and were taught 
by eight different college seniors (one teacher for each class for each 
semester) under the personal supervision of the writer. As there is a 
slight tuition charge for attendance at this school the children probably 
came from homes slightly above the average. It does not seem, how- 
ever, that this fact should detract from the value of the results obtained. 
The Intelligence Quotients of this group ranged from ninety to one 
hundred twenty-eight with a median of one hundred fourteen and were 
obtained by means of the Otis Group Intelligence Test. 

The experimental test used was the New York State Regents’ 
Examination for August, 1928. This examination was divided into 
two parts. Part I “is intended to give a comprehensive test on the
manipulative technique of the subject and whose rating permits of no
partial credit; and part II which is intended to test the applied side of
the subject in the solution of verbal problems and whose rating does
permit of partial credit.”¹ This test was first given to the group on
May 14, 1929, directly following the completion of the last new work to 
be presented in the course. The pupils tried hard to do their best, being 
very anxious to see how high a score they could obtain on an actual 
Regents’ Examination which had been given in the past. The papers 
were not returned to them nor the solution of the questions discussed 
with them, but their marks were given to them and proved to be a 
splendid means of motivation for the month’s intensive general review 
which immediately followed. The same test was again given to the 
same group on June 13, 1929 in the nature of a “Preliminary
Examination.” None of the pupils seemed to recognize that they had taken
the same test a month earlier and they worked with great care because
they knew it was a test similar to the Regents’ Examination which was
to follow within a few days. Again the pupils did not receive their
papers nor was the solution of the questions discussed, but they were
given the marks they obtained on the test. On June 17, 1929 the
group took the final Regents’ Examination, the passing of which marks
the completion of the study of Elementary Algebra. The passing
grade for this examination is sixty-five per cent. The scores made by
the pupils on this test are considered in this report so that it may be
seen that the test used for experimental purposes was a typical Regents’
Examination. On May 12, 1930, eleven months later, the same group
was again given the experimental test. During the interval between
the final Regents’ Examination and the third application of the
experimental test these pupils had not received instruction in any kind
of mathematics as the study of Plane Geometry is not begun in this
school until the eleventh year. No opportunity for reviewing Elementary
Algebra was afforded the pupils as they were not told the
exact nature of the test but were asked to take an examination which
would help to determine what they needed to be taught as a basis for
the study of Plane Geometry. It is a safe assumption that these
pupils were not using Elementary Algebra to any degree in their life
outside of school. The pupils were much interested in seeing how
much they remembered and made an effort to do their best. Many
of them inquired for their grades after the test had been corrected.

1 “A Tentative Syllabus in Elementary Algebra.” Albany, New York: The
University of the State of New York, 1928, p. 5.


TABLE I.—MEAN, MEDIAN AND RANGE OF SCORES BY PARTS AND FOR THE WHOLE
FOR THE THREE EXPERIMENTAL TESTS AND THE FINAL TEST

                                                   | Mean | Median | Range
Experimental Test of May, 1929.......... Part I    | 34.6 |   35   | 20- 48
                                         Part II   | 36.7 |   41   |  9- 50
                                         Whole     | 71.3 |   74   | 32- 95
Experimental Test of June, 1929......... Part I    | 42.6 |   43   | 30- 50
                                         Part II   | 44.5 |   47   | 20- 50
                                         Whole     | 87.1 |   89   | 53-100
Experimental Test of May, 1930.......... Part I    | 24.4 |   23   |  8- 43
                                         Part II   | 31.8 |   34   |  6- 50
                                         Whole     | 56.2 |   56   | 21- 90
Regents’ Examination of June, 1929...... Part I    | 41.5 |   40   | 30- 50
                                         Part II   | 47.2 |   48   | 33- 50
                                         Whole     | 88.7 |   87   | 68-100

From detailed tabulation of the scores by parts and as a whole
obtained by each pupil on the three experimental tests and on the final
Regents’ Examination, Table I was derived.

A study of Table I reveals several interesting facts. The distribu- 
tion of scores for each test given was as near normal as could be 
expected as evidenced by the similarity of the mean and median in each 
test. The final Regents’ Examination had a mean, median and range
closely resembling those of the second experimental test given. This
indicates that the Regents’ Examination chosen as an experimental test
was a typical one. The amount of algebraic knowledge retained as
shown in the third experimental test results corresponds more closely 
with the amount known before the intensive review than with the 
amount known at the end of the course. In other words, these pupils 
do not seem to retain as well the work learned during the review period 
as they do the work learned during regular class instruction. 

As a further means of determining the amount retained a compari- 
son of the gain per cent of the second experimental test over the first is 
made with the loss per cent of the third test over the second. This is 
done in Table II. Cases 4, 21, and 27 were absent from school during 
the review period and did not, therefore, have an equal opportunity for 
improvement with the other pupils. 

A study of Table II reveals that the majority of pupils lost more in 
the eleven months during which they received no instruction in mathe- 
matics than they gained in the month’s intensive review. The average 
gain per cent of the second experimental test over the first is twenty- 
three per cent. The average loss per cent of the third test over the 
second is thirty-six per cent. In other words, these pupils, measured 
in this way and knowing what they did at the end of the course, 
retained about one-third of what they knew. 

As a check on the amount of knowledge retained the ten boys for 
whom there are available Intelligence Quotients are paired with ten 
girls with the same average intelligence (110.7). For no pair do these 
IQ’s differ by more than three points. The range is from ninety to 
one hundred twenty-three. The paired cases are shown in Table III. 

A study of Table III shows that these boys knew less before the 
review period than did the girls and that during the review period they 
gained more than did the girls, making an equal average for the two 
groups at the end of the course. The average per cent of increase
during the review period is thirty per cent for the boys and eighteen per
cent for the girls. Considering these twenty pupils as a single group
the average per cent of increase would be twenty-four per cent which
checks as closely as can be expected with the twenty-three per cent
increase of the fifty-one cases. It is also evident that the boys lost
more knowledge in the eleven month period between the second and
third experimental tests than did the girls. The average loss per cent
for the boys is forty-eight per cent and for the girls is twenty-eight per
cent. This gives a loss per cent of thirty-eight per cent for them as a
single group which checks the loss of thirty-six per cent obtained for the
fifty-one cases. It should not be inferred that boys will always gain
more than girls during a review period and retain less than girls. There
are not enough cases available to pair to make such a general conclusion.


TABLE II.—THE GAIN OR LOSS PER CENT BETWEEN THE FIRST AND SECOND AND
THE SECOND AND THIRD EXPERIMENTAL TESTS

           | Gain per    | Loss per      ||           | Gain per    | Loss per
 Case No.  | cent of two | cent of three || Case No.  | cent of two | cent of three
           | over one    | over two      ||           | over one    | over two

     1     |     46      |      7        ||    27     | [illegible] |     19
     2     |     11      |     61        ||    28     |     19      |     22
     3     |     15      |     33        ||    29     |     11      |     47
     4     |      0      |     34        ||    30     |     19      |     34
     5     |    108      |      8        ||    31     |    163      |     42
     6     |      5      |     24        ||    32     |     14      |     15
     7     |     19      |     35        ||    33     |     30      |     34
     8     |      4      |     27        ||    34     |      7      |     53
     9     |      3      |     21        ||    35     |      9      |     38
    10     |     21      |     44        ||    36     |     18      |     31
    11     |      4      |     32        ||    37     |     79      |     70
    12     |     17      |      6        ||    38     |     65      |     64
    13     |     56      |     54        ||    39     |     15      |     34
    14     |      5      |     15        ||    40     |      7      |     23
    15     |      4      |     43        ||    41     |     16      |     28
    16     |     52      |     60        ||    42     |     61      |     43
    17     |     62      |     34        ||    43     |     22      |     14
    18     |     43      |     27        ||    44     |     13      |     64
    19     |      4      |     28        ||    45     |      7      |      0
    20     |     28      |     52        ||    46     |     84      |     56
    21     |     -1      |     34        ||    47     |      6      |     65
    22     |     14      |     42        ||    48     |     31      |     66
    23     |     16      |     10        ||    49     |     23      |     60
    24     |      8      |      3        ||    50     |     59      |     23
    25     |      5      |     12        ||    51     | [illegible] |     75
    26     |     13      |     42        ||           |             |


TABLE III.—CASE NUMBERS, IQ’S AND GRADES ON THE THREE EXPERIMENTAL
TESTS FOR THE TWENTY PAIRED CASES

                 Boys                   ||                 Girls
                     |   Test grades    ||                     |   Test grades
 Case No. |   IQ     |  1  |  2  |  3   ||  Case No. |   IQ    |  1  |  2  |  3

    12    |  120     | 69  | 81  | 76   ||     23    |  119    | 86  | 100 | 90
    13    |  105     | 52  | 81  | 37   ||     22    |  103    | 79  |  90 | 52
    18    |  105     | 63  | 90  | 66   ||      8    |  105    | 81  |  84 | 61
    29    |  114     | 84  | 96  | 63   ||      4    |  114    | 91  |  91 | 68
    37    |  116     | 48  | 86  | 26   ||     24    |  116    | 80  |  86 | 83
    40    |  123     | 86  | 92  | 71   ||     30    |  123    | 81  |  96 | 63
    44    |  105     | 80  | 90  | 32   ||     38    |  105    | 46  |  76 | 27
    47    |  116     | 86  | 91  | 32   ||     21    |  115    | 88  |  87 | 57
    49    |   90     | 61  | 75  | 30   ||      5    |   93    | 40  |  83 | 76
    51    |  113     | 41  | 85  | 21   ||     36    |  114    | 66  |  78 | 54
 Average grade...... | 67  | 87  | 45   ||  Average grade..... | 74  |  87 | 63
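
The percentages quoted for the paired cases can be reproduced from the average grades of Table III. The sketch below is only an illustration: it assumes that "gain per cent of two over one" means the rise from the first to the second test expressed as a per cent of the first score, and "loss per cent of three over two" the fall from the second to the third test expressed as a per cent of the second score. The paper does not state these formulas, and the function names are hypothetical.

```python
def gain_percent(earlier: float, later: float) -> float:
    """Rise from the earlier to the later score, as a per cent of the earlier score."""
    return (later - earlier) / earlier * 100.0


def loss_percent(earlier: float, later: float) -> float:
    """Fall from the earlier to the later score, as a per cent of the earlier score."""
    return (earlier - later) / earlier * 100.0


def summarize(grades):
    """Return (gain of test 2 over test 1, loss of test 3 over test 2), rounded."""
    first, second, third = grades
    return round(gain_percent(first, second)), round(loss_percent(second, third))


# Average grades on tests 1, 2 and 3, copied from Table III.
boys_average = (67, 87, 45)
girls_average = (74, 87, 63)

print(summarize(boys_average))   # (30, 48)
print(summarize(girls_average))  # (18, 28)

# The same assumed formulas applied to a single pupil (case 12 in Table III).
print(summarize((69, 81, 76)))   # (17, 6)
```

Under these assumptions the group figures agree with the thirty and forty-eight per cent reported for the boys and the eighteen and twenty-eight per cent reported for the girls, and the single-pupil figures agree with the entry for case 12 in Table II.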

















Tabulation and analysis of the pupils’ answers question by question 
revealed that, on Part I of the experimental test, questions number 2, 3, 
5, 6, 10, 11, 15, 16 and 20 were retained by a majority of the pupils who 
took the tests. What type of questions are these which are retained? 
A study of the question paper reveals that questions 2, 3, 5 and 16
involve a knowledge of factoring, that question 6 involves substitution 
in a formula, that questions 10 and 11 are verbal problems which 
require reasoning ability, that question 15 is a yes-no question that 
might be guessed but which, if solved, requires a knowledge of checking 
roots for quadratic equations and that question 20 involves a knowledge 
of finding an average. The first and nineteenth questions on this 
part were questions which as many pupils retained as forgot. This 
means that a little more drill was needed on this type of question. 
Question 1 involves factoring by trial and question 19 the solution of
a simple verbal number problem. But the questions to which more 
serious attention would be given are those which were not retained by 
the majority of the pupils. These are questions 4, 7, 8, 9, 12, 13, 14, 
17 and 18. Question 4 involves reasoning and a knowledge of algebraic 
addition and subtraction, question 7 a knowledge of addition of frac- 
tions, questions 8 and 9 the solution of numerical and literal fractional 
equations, questions 12 and 13 the manipulation of radicals, question 
14 the solution of a quadratic equation, question 17 the finding of an 
arithmetic square root and question 18 reasoning ability and a knowl- 
edge of algebraic division. Assuming that this teaching was done 
under representative school conditions, more drill should be given to 
pupils on this type of manipulative technique if they are to retain a 
working knowledge of it. 

Part II cannot be considered in exactly the same manner as Part I 
since students are allowed a choice of five questions out of eight on this 
part. It is a safe assumption, however, that the questions which were 
not chosen were omitted because of a lack of knowledge of how to 
solve them. Questions 21, 23, 27 and 28 seemed to be most often 
chosen and best retained. Question 21 is a number problem; question 
23 is a number problem involving fractions; question 27 is a true-false 
question which might, of course, be guessed; and question 28 is a bar 
graph. These things, then, under average conditions are sufficiently 
drilled on and as well retained as can be expected. Questions 22, 24, 25 
and 26 were not often chosen nor well retained. Question 22 involves 
a knowledge of the solution of simultaneous quadratic equations; 
question 24 is a measurement problem; question 25 is the solution of 
a quadratic equation correct to the nearest tenth; and question 26 
is a substitution problem that is very wordy. It is probable that the 
length of the statement of question 26 rather than a lack of knowledge 
of the method of solving it led the pupils to omit it. Under average 
conditions these are the types of problems on which more drill is 
needed if the pupils are to retain a working knowledge of them. 

In order to determine the relation of success in the several tests to 
the intelligence of the pupils the results in Table IV were obtained. 

Table IV reveals that the evidence that pupils will have the same 
ranking in the final Regents’ Examination that they have in a test of 
retention is present but low (.395). There is greater evidence (.496) 
that pupils will have the same ranking in the test to determine the 
amount retained as they have in a test given before the review period. 
The evidence that pupils will rank the same for the test of retention as
they do for Intelligence Quotients is present but low (.293). A
comparison of the correlation coefficients obtained by correlating the
Intelligence Quotients with Part I (.239) and with Part II (.267) of the
test of retention seems to indicate that the ranking according to
Intelligence Quotients more closely resembles the ranking of pupils on
Part II (ability to solve verbal problems) than it does that of Part I
(manipulative technique). This evidence, however, is present but low.
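
The figure printed after each coefficient in Table IV appears to be a probable error. As a hedged illustration, the sketch below assumes the conventional formula PE = 0.6745 (1 - r²) / √N, which the paper itself does not state; the function name is hypothetical.

```python
import math


def probable_error_of_r(r: float, n: int) -> float:
    """Probable error of a correlation coefficient r based on n cases.

    Assumes the conventional formula 0.6745 * (1 - r^2) / sqrt(n);
    the paper does not say how its probable errors were computed.
    """
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# The two test-with-test correlations of Table IV, each based on 51 cases.
print(round(probable_error_of_r(0.496, 51), 3))  # 0.071
print(round(probable_error_of_r(0.395, 51), 3))  # 0.08
```

On this assumption the two results match the values .071 and .080 given in the table for the correlations between tests.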

In summarizing, the conclusions made from this study are:

1. Pupils retain about one-third of the knowledge of Elementary
Algebra once known over a period of one year during which they receive
no instruction in mathematics.

2. Evidence indicates that retention of knowledge of Elementary 
Algebra can be predicted by a test given before the intensive general 
review period better than by an examination at the end of the course. 


TABLE IV.—THE NUMBER OF CASES AND CORRELATION COEFFICIENTS FOR
CORRELATIONS OF TESTS AND IQ’S WITH THE MAY, 1930 TEST

 Test of May, 1930    | Correlated with    |      r        | Cases

 Total score......... | Test of May, 1929  | .496 ± .071   |  51
 Total score......... | Test of June, 1929 | .395 ± .080   |  51
 Total score......... | IQ’s               | .293 ± .087   |  48
 Part I.............. | IQ’s               | .239 ± .089   |  48
 Part II............. | IQ’s               | .267 ± .088   |  48








3. For the data available girls tend to retain better than boys but 
the number of cases available is too small for the writer to be able to 
generalize about the retention of boys versus girls. 

4. Pupils tend to retain best a knowledge of factoring, substitution, 
verbal problems, finding an average, number problems involving 
integers and those with fractions and the construction of graphs. 

5. Pupils do not tend to retain a knowledge of the manipulation 
of fractions and fractional equations, of the solution of quadratic 
equations that have even answers and those which are solved correctly 
to a decimal place, of square root, of the solution of simultaneous 
quadratic equations and of the solution of measurement problems. 
These facts and skills should, therefore, be emphasized more than they 
are at present by the average teacher of Elementary Algebra. 
6. The amount of knowledge gained in the month’s intensive
general review is about two-thirds of that lost in the eleven months 
during which there is no instruction in mathematics. 

7. There is some evidence that the ranking of pupils according to
IQ’s and ranking according to the amount of knowledge retained
tends to be similar.

8. The evidence that the ranking of pupils according to IQ’s more
closely resembles their ranking according to retention of ability to 
solve verbal problems than it does their retention of manipulative 
technique is present but low. 


BIBLIOGRAPHY 


Anderson, J. P. and A. M. Jordan: Learning and Retention of Latin Words and 
Phrases. Journal of Educational Psychology, Vol. XIX, October, 1928, pp. 
485-496. 

Atkins, E. W.: Relation between Thinking and Memory in Mathematics. School,
Science and Mathematics, Vol. XXIII, November, 1923, pp. 760-770.

Bassett, S. J.: Retention of History in the Sixth, Seventh and Eighth Grades with 
Special Reference to the Factors that Influence Retention. Journal of 
Educational Psychology, Vol. XX, December, 1929, pp. 683-690. 

Butler, C. H.: Role of Memory in Algebra. School, Science and Mathematics,
Vol. XXII, June-December, 1922, pp. 523-534, 613-627, 723-728, 850-856.

Cober, E. W.: “A Study of High School Pupils with a View of Determining the
Extent of Recollection of Once Familiar Facts.” University of Pennsylvania,
1912, 46 pp.

Drushel, J. A.: A Study of the Amount of Arithmetic at the Command of High 
School Graduates Who Have Had No Arithmetic in Their High School Course. 
Elementary School Journal, Vol. XVII, May, 1917, pp. 657-661. 

Eikenberry, D. H.: Permanence of High School Learning. Journal of Educational 
Psychology, Vol. XIV, November, 1923, pp. 463-481. 

Gilkey, Royal: Correlation of Success in Subject Matter Courses in High School
with Success in the Same Subjects in College. School Review, Vol. XXXVII,
October, 1929, pp. 576-588.

Jones, Harold E.: Experimental Studies of College Teaching: The Effect of
Examination on Permanence of Learning. Archives of Psychology, Vol. X, 1923,
70 pp.

Lee, Ang Lanfen: An Experimental Study of Retention and Its Relation to Intelli- 
gence. Psychological Monographs, Vol. XXXIV, 1925, 45 pp. 

Luh, Chih Wei: The Conditions of Retention. Psychological Monographs, Vol. 
XXXI, 1922, 87 pp. 

Rugg, Harold O.: “Statistical Methods Applied to Education.” New York:
Houghton Mifflin Company, 1917, pp. 233-300.

Thorndike, E. L.: Permanence of School Learning. School and Society, Vol. XV,
June, 1922, pp. 625-627.








Thorndike, E. L.: “The Psychology of Algebra.” New York: Macmillan
Company, 1924, pp. 452-457.

Worcester, D. A.: Permanence of Learning in High School Subjects; Algebra. 
Journal of Educational Psychology, Vol. XIX, May, 1928, pp. 343-345. 


Miscellaneous 


“A Tentative Syllabus in Elementary Algebra.” Albany, New York: The
University of the State of New York, 1928, 12 pp.











A GRAPHIC METHOD OF FINDING STANDARD 
ERRORS AND PROBABLE ERRORS OF DIFFERENCES 


HAROLD A. EDGERTON 
Ohio State University 


In experimental education and psychology a large proportion of 
the comparisons of different populations involve the use of the standard 
error or probable error of the difference of two statistics. 

A graphic device for computing the standard errors of differences 
and probable errors of differences is given by the accompanying graph. 

The graph is usable only when the standard errors or probable 
errors are obtained from two different groups or if they are obtained 
from the same group and the measures are uncorrelated. That is,
when

$$\sigma_{(A-B)} = \sqrt{\sigma_A^2 + \sigma_B^2}$$

or

$$PE_{(A-B)} = \sqrt{PE_A^2 + PE_B^2}$$

For example, in computing the standard error of the difference of means, let

$$\sigma_{M_1} = 20, \qquad \sigma_{M_2} = 14$$





1. Since $\sigma_{M_1}$ is greater than $\sigma_{M_2}$, the value of $\sigma_{M_1}$ will be used as Y.

2. Find 20 ($\sigma_{M_1}$) on the Y scale.

3. Follow this line to the right to the point of intersection with
fourteen, the value of $\sigma_{M_2}$, found on the X scale.

4. This intersection is on a curved line or between two curved
lines. By tracing this curve to the C scale,* a value of about 24.4 is
found. This is the value of $\sigma_{(M_1 - M_2)}$, the standard error of the
difference of the means.

When the values of X and Y are both below ten, multiply both of
them by ten for computation and then divide the answer by ten.
Similarly, when the values of either X or Y are above one hundred,
divide both of them by ten for computation and then multiply the
answer by ten. Suppose now that $\sigma_{M_1} = 3.6$ and $\sigma_{M_2} = 4.2$.
Multiplying by ten, we obtain $\sigma'_{M_1} = 36$ and $\sigma'_{M_2} = 42$. In this problem
we locate $\sigma'_{M_2}$ on the Y scale since forty-two is greater than thirty-six.

1. Find the point forty-two on the Y scale.

2. Trace this to the right until it intersects the X scale at the value
of $\sigma'_{M_1}$ (36).

3. Interpolating between the curved lines, we find the value of
$\sigma'_{(M_1 - M_2)}$ on the C scale to be about 55.3.

4. Dividing the obtained value (55.3) by ten, we obtain
$\sigma_{(M_1 - M_2)} = 5.53$.
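
The graph readings in the two worked examples can be checked numerically. The sketch below is an illustration only, not part of the graphic method: it evaluates the square root of the sum of the squared errors directly, and the function name is hypothetical. As with the graph, it applies only when the two measures are uncorrelated.

```python
import math


def error_of_difference(error_a: float, error_b: float) -> float:
    """Standard (or probable) error of a difference of two uncorrelated statistics.

    Computes sqrt(error_a**2 + error_b**2), the quantity read from the C scale.
    """
    return math.hypot(error_a, error_b)


# First example: standard errors of 20 and 14.
print(round(error_of_difference(20, 14), 1))      # 24.4

# Second example, following the scaling rule: multiply both by ten,
# compute, then divide the answer by ten.
scaled = error_of_difference(3.6 * 10, 4.2 * 10)
print(round(scaled, 1))                           # 55.3
print(round(scaled / 10, 2))                      # 5.53
```

Multiplying both values by ten and dividing the answer by ten leaves the result unchanged, which is why the scaling rule for small or large values works.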





* When the answer is less than one hundred, it may be obtained by tracing 
the curved line to either the C scale or the Y scale. 


A STUDY OF EARLY ENTRANCE TO COLLEGE 


Y. SILVERMAN AND VERNON JONES 


Clark University 


THE PROBLEM 


The Bearing of the Study upon Policies of Enrichment and Rapid 
Promotion.—Many arguments have been advanced for and against 
allowing bright students to enter college at an early age. It is inter- 
esting, however, to note that no thorough study has been made of 
the views of a large number of students who actually entered early 
and who can, therefore, speak from experience. The purpose of 
this article is to give some evidence of this type from about five
hundred students who entered college at 17½ years of age or younger.

Various observations have been made by educationalists concerning 
the difficulties which stand in the path of a boy or girl who enters 
college at fifteen, sixteen, or seventeen years of age. It is often 
claimed that a student who enters young does not profit by or enjoy 
the social life in college as much as if he had entered at the normal 
age; it is claimed that he does not stand as good a chance to develop 
traits of leadership; it is said that he is at a disadvantage in athletics, 
and in such courses as sociology, economics, and philosophy. Very 
important decisions affecting young bright students are made by 
educators on the basis of opinions in this matter. Not only are 
minimum age requirements set up in many colleges on the basis of 
such opinions, but all plans for dealing with young bright pupils in 
all grades of the elementary and high schools are influenced by the 
attitudes of teachers and school administrators concerning this problem 
of the advisability of allowing pupils to save time in getting through
school. There is no doubt that many students can by age fifteen or 
sixteen master the subject-matter necessary for entrance to college, 
but if elementary and high-school teachers are of the opinion that 
such early entrance is unwise they can, through advice or regulation, 
cut down to a very small figure the number who actually do apply for 
entrance at a young age. 

A variety of methods have been devised by elementary and
secondary school officials in attempts to adapt school work to
individual differences among pupils. But in all the plans for adapting
the work to the needs and capacities of the bright children there is
inherent the fundamental question as to whether or not the child will
be allowed to save time in mastering the knowledge and skill and 
acquiring the attitudes and appreciations required for entrance to 
college or for graduation from the secondary school. In considering 
the plans for dealing with bright children one thinks, of course, of the 
various devices for rapid promotion and the various devices for 
enrichment. But these two are not mutually exclusive. Plans for 
enrichment often contain not only more work on the same level of 
difficulty—horizontal enrichment—but also advanced work which 
may lead to special promotions—vertical enrichment. Rapid pro- 
motion may be looked upon as a form of enrichment where the chief 
emphasis is upon vertical enrichment. Plans for enrichment and for 
rapid promotion are means employed by educators to carry out their 
views on whether a bright child’s time should be employed with doing 
more tasks on the same levels as normal children and progressing 
through school at the normal rate, or whether his time should be used 
in gradually doing more difficult tasks than normal children and 
progressing through school at a somewhat faster rate. The fundamental
issue, therefore, is not between enrichment and rapid promotion,
but between saving time and not saving time in mastering
desired skills, knowledge, and attitudes of the elementary and secondary
schools.


THE PLAN OF PROCEDURE 


The data which are to be reported in this article bear directly 
upon this central issue for those pupils who are going to college. Facts 
will be given not only to show how well those students who have 
actually gone to college at an early age get along in their college 
studies, but also to show how they feel about the advantages and 
disadvantages of entering college young. The material will be
organized around three questions: 

1. What is the relation, if any, between one’s age at entrance to 
college and his view of the optimum age for entrance? 

2. To what extent did students who actually entered college at 
fifteen, sixteen, or seventeen years of age experience difficulties in 
adjusting to the college situation? 

3. Did these young students do as well in scholastic work as those 
entering at normal age? 

Following a preliminary study based on the replies of eighty-four 
teachers, a questionnaire was sent out to a large number of college 
students and alumni. Replies were received from nine hundred three
persons. This total number can be divided into four groups: Group I, 
consisting of six hundred sixty-one undergraduate students including 
freshmen, sophomores, juniors, and seniors; Group II, consisting of 
sixty-seven graduate students and college alumni; Group III, con- 
sisting of one hundred five undergraduate members of Phi Beta 
Kappa; Group IV, consisting of seventy alumni members of Phi 
Beta Kappa. The groups will hereafter be referred to by number. 
A copy of the questionnaire used is given below. 


A study is being made of the influence of age at entrance to college upon success 
in college. 


We ask your cooperation and a few minutes of your time to answer the question- 
naire printed below. 


QUESTIONNAIRE 


 1. [partly illegible] male ......  female ......
    (If you are a student, please state your class)
 2. At what age did you enter college?..........
 3. Would you prefer to have entered at an earlier age?         Yes ......  No ......
    If so, at what age?..........
 4. Would you prefer to have entered at a later age?            Yes ......  No ......
    If so, at what age?..........
 5. Would you prefer to have remained in the elementary school
    [remainder of question illegible]                           Yes ......  No ......
    If so, would you prefer to have had
    (a) More intensive work there?                              Yes ......  No ......
    (b) More extensive work there?                              Yes ......  No ......
 6. Would you prefer to have remained in the high school for a
    [remainder of question illegible]                           Yes ......  No ......
    If so, would you prefer to have had
    (a) More intensive work there?                              Yes ......  No ......
    (b) More extensive work there?                              Yes ......  No ......
 7. Do you believe that there should be a minimum age for college
    entrance?                                                   Yes ......  No ......
    [line illegible]
 8. Do you believe that you enjoyed your social life in college as
    much as the average student?                                Yes ......  No ......
    If not, was this due chiefly to (a) age?...............
                                    (b) other factors?.......
                                        (Please state)
 9. Do you feel that you had less opportunity than the average
    student to be a leader in college activities?               Yes ......  No ......
    If so, was this due chiefly to (a) age?..............
                                   (b) other factors?.......
                                       (Please state)
10. Did you participate in athletics?                           Yes ......  No ......
    If not, was this due chiefly to (a) age?.............
                                    (b) other factors?.......
                                        (Please state)
11. In the study and discussion of social, economic, sociological,
    and philosophical problems did you feel a lack of experience
    that you attribute to age?                                  Yes ......  No ......
12. Further comments on the influence of age at entrance to college
    on success in college will be appreciated.

Please return this questionnaire in the enclosed addressed envelope.


RESULTS 


Relation between One’s Age at Entrance to College and His View of 
the Optimum Age.—One of the writers asked eighty-four experienced 
teachers in his courses in tests and measurements to indicate the 
minimum age at which very capable children should be permitted to 
graduate from high school and apply for entrance to college. It 
was found that there was a distinct tendency on the average for those 
teachers who entered college early themselves to indicate lower ages 
than those who entered late. This seemed to be very significant, and 
it was decided to see if this same relation held with other groups differ- 
ing widely in their ages at entrance to college. The facts on this were 
yielded by replies to two main questions: First, the question as to 
whether each individual would prefer to have entered at an earlier 
or a later date; and, second, the question as to his suggestion of a 
minimum age for college entrance. The results are given in Table 
I. Table I shows that the average age at which these subjects would 
like to have entered college steadily increases with the age at which 
they did enter. The group which entered at age 19.5, for example, 
has as its average preferred entrance age 19.01, whereas the group 
entering at 15.5 has as its average preferred entrance age 16.83. 
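
The averages in the bottom row of Table I are weighted means of the class midpoints. A minimal sketch in Python, assuming the frequencies for the group entering at 16.5 are read correctly from that column of the table (the variable and function names are illustrative only):

```python
# Preferred entrance age (class midpoint) -> number of cases, for those who
# actually entered at 16.5. Values are a reading of the 16.5 column of Table I.
preferred_counts_entered_16_5 = {
    20.5: 1,
    19.5: 3,
    18.5: 29,
    17.5: 17,
    16.5: 79,
    15.5: 2,
}


def mean_from_counts(counts):
    """Weighted mean of class midpoints, each weighted by its frequency."""
    total_cases = sum(counts.values())
    weighted_sum = sum(age * n for age, n in counts.items())
    return weighted_sum / total_cases


n_cases = sum(preferred_counts_entered_16_5.values())
mean_preferred = mean_from_counts(preferred_counts_entered_16_5)
print(n_cases, round(mean_preferred, 2))
```

Under that reading the column holds 131 cases and its mean rounds to 17.16, which agrees with the tabled figures for the 16.5 group.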

Distributions have been worked out in a manner similar to that 
given in Table I for each of the groups mentioned on p. 60 separately, 
but no appreciable differences were found among the distributions of 
the different groups. However, the mean preferred entrance age 
indicated by each group will be given in Table II in order to show 
that there are no important differences between the college students
and the alumni and none between the Phi Beta Kappa group and the
non-Phi Beta Kappa group in the age preferred for entrance to college. 


TABLE I.—A COMPARISON BETWEEN THE AGE AT WHICH NINE HUNDRED THREE
STUDENTS AND ALUMNI ENTERED COLLEGE, AND THE AGE AT WHICH THEY
WOULD PREFER TO HAVE ENTERED

Preferred                            Entrance age
entrance age        14.5    15.5    16.5    17.5    18.5    19.5    20.5       N
20.5                 ..      ..       1      13      23       4      35       76
19.5                 ..      ..       3      32       5      53       5       98
18.5                 ..       4      29      58     193      15      25      324
17.5                 ..       2      17     224      25      13       3      284
16.5                  1       1      79      15      10     [?]     [?]      108
15.5                 ..       6       2     [?]     [?]     [?]     [?]      [?]
14.5                [?]     [?]     [?]     [?]     [?]     [?]     [?]        4
Average of preferred
entrance ages      15.00   16.83   17.16   17.92   18.51   19.01   19.41
N                     4      13     131     343     257      87      68      903

TABLE II.—AGE AT WHICH STUDENTS ENTERED COLLEGE AND THE AVERAGE
PREFERRED AGE FOR FOUR GROUPS SEPARATELY. NUMBERS IN PARENTHESES
REFER TO NUMBER OF CASES UPON WHICH THE AVERAGES WERE
COMPUTED

                                                   Entrance age
                                   14.5    15.5    16.5    17.5    18.5    19.5    20.5
Group I undergraduates            14.50   16.63   17.30   17.93   18.56   18.93   19.82
                                    (1)     (8)    (95)   (263)   (189)    (67)    (38)
Group II graduates and alumni     .....   16.50   16.61   17.93   18.60   19.50   19.06
                                    (0)     (2)     (9)    (14)    (21)     ( )    (16)
Group III student members of
  Phi Beta Kappa                  15.17   17.00   16.69   17.86   18.40   19.33   18.50
                                    (3)     (2)    (16)    (45)    (31)     (6)     (2)
Group IV alumni Phi Beta Kappa    .....   18.50   16.86   17.88   18.13   19.06   19.58
                                    (0)     (1)    (12)    (21)    (16)     (9)    (42)





There are two conclusions which can be drawn from the facts 
presented in these two tables. First, if college students or alumni 
are asked at what age they would prefer to have entered college in 
view of their experiences in college, their answers will depend on the 





Early Entrance to College 63 


average upon the age at which they did enter, and the older they were 
at entrance, the higher will be the preferred age given. Second, the 
student entering at a late age tends to prefer to have entered a little 
earlier, and the student entering at an early age tends to prefer to 
have entered a little later. There is, in other words, a definite tend- 
ency for the preferred entrance ages given to regress toward the 
entrance age of the average student. The last point seems to show 
that neither the young entrant nor the old entrant was altogether 
satisfied with his entrance age, each preferring to have been a little 
nearer to the average of the group. But the student who entered 
early did not on the average set his preferred entrance age as high 
as the mean entrance age of 18.02, nor did the student who entered 
late set it as low as this. 
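
The two conclusions can be checked directly against the bottom row of Table I. The sketch below, in Python, assumes only the tabled averages and the mean entrance age of 18.02 quoted in the text; the helper name `describe` is illustrative and not part of the original study.

```python
MEAN_ENTRANCE_AGE = 18.02  # mean entrance age of the group, as given in the text

# Entrance age -> average preferred entrance age (bottom row of Table I).
average_preferred = {
    14.5: 15.00,
    15.5: 16.83,
    16.5: 17.16,
    17.5: 17.92,
    18.5: 18.51,
    19.5: 19.01,
    20.5: 19.41,
}


def describe(entrance_age, preferred_age):
    """Return the shift in age and whether the preferred age stays on the
    same side of the group mean as the actual entrance age."""
    change = preferred_age - entrance_age
    same_side = (entrance_age - MEAN_ENTRANCE_AGE) * (preferred_age - MEAN_ENTRANCE_AGE) > 0
    return change, same_side


results = {age: describe(age, pref) for age, pref in average_preferred.items()}
for age, (change, same_side) in results.items():
    print(f"entered {age:4.1f}: change {change:+.2f}, same side of mean: {same_side}")
```

Groups entering at 17.5 or younger shift upward, groups entering at 19.5 or older shift downward, the 18.5 group is practically unchanged, and no group's average preferred age crosses the mean of 18.02.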

A comparison between the age of entrance and the minimum age 
suggested for college entrance brings out facts which fit in well with 
the two conclusions just mentioned. (Or, to say it a little more accurately,
the figures based on that sampling of the population which was
willing to venture a minimum age figure bear out these facts. Not all 
of the nine hundred three subjects were sure that there should be any 
minimum entrance age. Five hundred fifty-five subjects, or 61.5 per 
cent of the group, said that they thought there should be a mini- 
mum age, and all of these except fourteen indicated what they thought 
this age should be.) 
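
The counts in this parenthesis can be reconciled with the number of cases in Table III. A short sketch in Python, using only figures quoted in the text and the "Totals" column of Table III (variable names are illustrative):

```python
total_respondents = 903
favoring_minimum = 555   # said there should be a minimum entrance age
gave_no_age = 14         # favored a minimum but did not name an age

share_favoring = 100 * favoring_minimum / total_respondents
gave_an_age = favoring_minimum - gave_no_age

# Number of cases per entrance-age row in the "Totals" column of Table III.
table_iii_total_cases = [5, 67, 212, 149, 61, 47]

print(round(share_favoring, 1), gave_an_age, sum(table_iii_total_cases))
```

The share comes to 61.5 per cent, and the 541 subjects who named an age equal the sum of the cases tabulated in Table III.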

Table III gives the average minimum entrance age suggested by 
each age group. It will be seen that those who entered college late 
(or had a high entrance age) suggest higher minimum entrance ages 
than do those who themselves entered college early. Let us examine 
the statistics in some detail. 

The ‘‘totals’”’ column shows a steady increase in average minimum 
entrance ages with increases in entrance ages, except for the group 
entering at 15.5, where the average based on five cases is practically 
identical with the minimum age given by the group entering at 16.5. 
This tendency for the minimum ages reported to vary directly with 
entrance age of the reporter was true not only in the total sampling— 
which of course would be largely affected by the results obtained from 
the large sampling of college students in Group I—but also in each 
separate group. The second important fact shown by Table III is 
that the average minimum age suggested by the students and alumni 
who entered very early is higher than their own entrance age, whereas 
the minimum age suggested by those who entered late is lower than
their entrance age. The group entering at age 16.5 suggested 16.87
as a minimum age for entrance, the 17.5 group suggested 17.07, the 
18.5 group suggested 17.75, the 19.5 group suggested 18.16, and the 
20.5 group suggested 18.69. There is, therefore, a clear tendency for 
those reporting to suggest for a minimum age an age which is between 
their own entrance age and one which is about one-half year below the 
modal age. In other words, there is a tendency for the minimum 
age suggested to represent a regression from the suggestor’s own 
entrance age toward approximately 17.0. The lowest ‘‘minimum 
age”’ suggested was fifteen. The significance of this seems to be that 
those who entered college early did not have such experiences as to 
lead them to recommend high minimum ages. Indeed, it is the group 
which entered late which is most concerned about setting the entrance 
age high. It has already been stated that this tendency was found 
among experienced teachers. In a study of eighty-four experienced 
elementary and junior-high-school teachers and principals, 16 per 
cent were found to have entered at age nineteen or older. The 
minimum age for college entrance suggested by this older group was
1.4 years higher than the age suggested by those entering under 
eighteen years of age. 
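
The "Totals" averages in Table III are the group averages pooled in proportion to the number of cases. A minimal sketch in Python, assuming the group figures for two entrance ages as printed in the table (the function name `pooled_mean` is illustrative):

```python
# (number of cases, average suggested minimum age) for Groups I to IV.
row_16_5 = [(51, 16.93), (5, 16.30), (6, 16.67), (5, 17.10)]
row_17_5 = [(173, 17.06), (7, 17.64), (21, 17.03), (11, 16.95)]


def pooled_mean(groups):
    """Combine group means into one mean, weighting each by its case count."""
    total_n = sum(n for n, _ in groups)
    weighted_sum = sum(n * mean for n, mean in groups)
    return weighted_sum / total_n, total_n


for label, row in (("16.5", row_16_5), ("17.5", row_17_5)):
    mean, n = pooled_mean(row)
    print(label, n, round(mean, 2))
```

The pooled values round to 16.87 (67 cases) and 17.07 (212 cases), matching the Totals column. Because Group I supplies most of the cases, it dominates the pooled figure, as the text notes.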


TABLE III.—A COMPARISON BETWEEN ENTRANCE AGE AND SUGGESTED MINIMUM
AGE FOR ENTRANCE TO COLLEGE. DISTRIBUTIONS GIVEN SEPARATELY FOR
EACH GROUP AND FOR THE TOTAL

(Each group column gives the number of cases and the average suggested
minimum entrance age.)

                 Group I         Group II        Group III       Group IV        Totals
Entrance age     No.   Average   No.   Average   No.   Average   No.   Average   No.   Average
15.5               4    17.00      0    .....      1    16.50      0    .....      5    16.90
16.5              51    16.93      5    16.30      6    16.67      5    17.10     67    16.87
17.5             173    17.06      7    17.64     21    17.03     11    16.95    212    17.07
18.5             120    17.83     10    17.80     13    17.12      6    17.33    149    17.75
19.5              45    18.30      4    18.00      5    17.70      7    17.64     61    18.16
20.5 & over       27    19.12     12    18.33      2    17.00      6    18.00     47    18.69

Provisions for bright children to prepare for college at a younger- 
than-average age, and provisions for such children to be permitted to 
enter college after being prepared, are contingent upon the idea 
among teachers and administrators that such saving of time is desirable.
It is interesting to note, therefore, that the idea of a person
concerning the desirability of saving time is related to the amount of 
time which he saved. One can depend upon it, that those persons, 
on the average, who themselves finished high school and entered college 
late will look with disfavor upon any plan whereby bright children 
can save much if any time in preparing for college. Of course, there 
are exceptions, some persons who entered young suggesting high ages 
for college entrance, and some who entered late suggesting some 
rather low ages, but the general trend is unmistakably in the direction 
stated. 

Preference for Use of Extra Time by Those Who Would Prefer to 
Have Entered College Later.—In the case of those early entrants who 
would prefer to have entered at a later date, an attempt was made to 
find out where they would prefer to have spent the extra time. It 
was interesting to find that of the one hundred sixty-one students who 
stated that they would prefer to have entered college later only 22.4 
per cent consider the elementary school as a desirable place to spend 
the extra time. 51.6 per cent stated that they would have spent their 
time in the high school. The remainder, 26.0 per cent, consisted of 
two groups: those who did not answer the question and those who 
stated that they would prefer to have used the extra time in a special 
college-preparatory school or a junior college. This finding seems to 
indicate that the holding of young bright children back in the ele- 
mentary grades when they are prepared to do more advanced work 
cannot be defended on the grounds that a large number of such 
children will later look with approval on this plan. 

At present various plans whereby bright children are given ‘‘ busy 
work” to keep them employed but not allowed to save any time are 
based on certain theories in philosophy of education and methods of 
teaching, but the truth of these theories themselves will have to be 
tested in the light of the degree to which they serve to assist individual 
pupils to adjust adequately to the situations which they meet. The 
fact that only 33 per cent of the four hundred ninety-one students 
entering college at age 17½ or younger signified that they would
prefer to have spent more time than they did in preparing for college 
and the fact that only 22 per cent of these—7.3 per cent of the total— 
would prefer to have spent any more time than they did in the ele- 
mentary school would seem to indicate very clearly that these students 
who ‘‘have been through the mill” do not feel that the extension of 
their time in the elementary school would have been as serviceable 


66 The Journal of Educational Psychology 


in making for adequate adjustments as an equal amount of time spent 
in more advanced study. 
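
The figure of 7.3 per cent follows from taking a percentage of a percentage. A brief sketch in Python, using the rounded percentages quoted in the text; the case count at the end is an approximation derived from them, not a figure reported by the writers.

```python
early_entrants = 491

share_wanting_more_preparation = 0.33    # of all early entrants
share_of_those_naming_elementary = 0.22  # of the 33 per cent

# Share of ALL early entrants who would have stayed longer in elementary school.
share_of_total = share_wanting_more_preparation * share_of_those_naming_elementary
approx_cases = round(early_entrants * share_of_total)

print(round(100 * share_of_total, 1), approx_cases)
```

The product is about 7.3 per cent of the four hundred ninety-one early entrants, or roughly 36 persons.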

Success of Young Students in College.—Back in 1913, Dean Holmes 
of Harvard University published an article¹ in which an analysis was
made of the records of 5769 boys who entered that institution between 
1902 and 1912 with a view to determine the success of the students who 
entered young. He comes to the following conclusion: ‘‘They [the 
results of this analysis] prove that youth in itself is no bar to a credit- 
able college career. They prove that college conditions do not put 
young men at a fatal disadvantage; they dispose of the vague con- 
viction that college life is too much for the boy of seventeen . . . The 
college may with confidence urge parents to send their boys to college 
young.” However, these results, significant as they were at the 
time they were gathered, were based mainly on a study of class grades 
and of the dean’s records of discipline. For a measure of the actual 
achievement of the young student in college, the results of a thorough- 
going achievement test would be more reliable than class grades, and 
for a measure of adjustment there are several other facts which would 
be desirable in addition to the records of the degree to which the stu- 
dents showed their maladjustment overtly through the breaking of 
college regulations. 

In the present study no attempt was made to measure objectively 
the achievement of students who went through college at a younger- 
than-average age, because the Pennsylvania survey of achievement 
of college seniors yields very reliable results on this problem. All 
that will be done here, therefore, will be to call attention to the results 
in this survey which bear upon the present problem. On the second 
question—that of adjustment—we shall offer evidence bearing on 
whether or not the students entering at various age levels felt that
they were well adjusted. It is, of course, admitted that evidence
on whether or not an individual feels that he is or was maladjusted
to his environment is not a perfect measure of his adjustment
or maladjustment. Results based on careful psychiatric study would 
perhaps be better, though at the present stage of measurement in the 
psychiatric field it is not certain that any better single objective meas- 
ure of the maladjustment of college students could be used than the 
report of the subject concerning his own feeling in the matter. It is 





1 Holmes, H. W.: Youth and the Dean. Harvard Graduate Magazine, Vol. 
XXI, 1913, pp. 599-610. 







true that a college boy may feel that he is adjusted satisfactorily to 
his environment and yet be, in the mind of a psychiatrist, maladjusted; 
but those cases who think they are maladjusted are for this very 
reason, if for no other, pretty sure to be maladjusted, and it is very 
probable that such cases are the most serious ones and the most 
numerous.

The Relation between Age and Achievement as Found in the Pennsyl- 
vania Survey.—The College Achievement Test, devised and adminis- 
tered under the sponsorship of the Carnegie Foundation for the 
Advancement of Teaching, was given in the state of Pennsylvania in 
1928 to 4011 college seniors, ranging in age from eighteen to twenty- 
six. The test was very extensive, containing 3400 questions on a 
wide variety of topics ordinarily covered in college. The average 
crude scores obtained by the various age groups are given in Table IV. 
It will be noted that the young seniors did not merely hold their own 
with their older classmates; they led the others in every case. The 
highest average achievement score was made by the eighteen-year-old 
seniors, the next highest by the nineteen-year-olds, and so on, the 
average score decreasing steadily with increases in age up to age 
twenty-five.¹
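
The trend described here can be verified mechanically from Table IV. The following Python sketch assumes the case counts and average scores as printed in that table; the variable names are illustrative.

```python
# (age at time of testing, number of cases, average score), from Table IV.
table_iv = [
    (18, 12, 759),
    (19, 55, 672),
    (20, 441, 622),
    (21, 1196, 567),
    (22, 1172, 563),
    (23, 616, 522),
    (24, 297, 506),
    (25, 146, 527),
    (26, 76, 515),
]

total_cases = sum(n for _, n, _ in table_iv)

# First age whose average score fails to fall below that of the preceding age.
first_non_decrease = next(
    (table_iv[i + 1][0]
     for i in range(len(table_iv) - 1)
     if table_iv[i + 1][2] >= table_iv[i][2]),
    None,
)

print(total_cases, first_non_decrease)
```

The counts sum to the 4011 seniors tested, and the average score falls at every step through age twenty-four; the twenty-five-year-olds are the first group not to score below the preceding age.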

Comparison between Younger and Older Entrants in Their Views of 
Their Own Adjustment in College.—Questions 8, 9, 10, and 11 in the
questionnaire were designed to get the views of a large number of 
college students and graduates on the influence which their age of 
entrance had upon their success in adjusting themselves to various 
college activities. Table V gives the results on these four questions. 
Results are given separately for the group which entered at the age 
of 17½ or younger and the group which entered later than 17½;





1 These data upon the relation of age of reaching the end of the senior year to 
total achievement score are not necessarily the same as the relation between age 
at entrance and total achievement score. However, in the case of the very young 
seniors, their age is indicative of the time they entered, and these are the ones we 
are most concerned about in this study. Moreover, questionnaires were received 
by the writers from fifty-seven of this group, and in every case the entrance age 
could have been accurately predicted by subtracting four or three from the age 
at which they took the test. No doubt some of those graduating late entered 
five or six years earlier and were graduating late due to failures or to the fact that 
they were out of school during their college career, but we have no reason to 
believe that any adjustments made for this fact would materially alter the facts 
of the table so far as they bear upon the problem at hand.

there were four hundred ninety-one cases in the former group and
four hundred twelve in the latter. 


TABLE IV.—AVERAGE SCORES ON THE COLLEGE ACHIEVEMENT TEST OBTAINED
BY SENIORS OF VARYING AGES*

Age at time of testing     Number of cases     Average score
         18                        12               759
         19                        55               672
         20                       441               622
         21                     1,196               567
         22                     1,172               563
         23                       616               522
         24                       297               506
         25                       146               527
         26                        76               515
Total number of cases           4,011

* Data from the Survey Sponsored by the Carnegie Foundation for the
Advancement of Teaching.


Those who oppose any saving of time on the part of bright pupils 
lay much emphasis upon the danger of such students’ being at a 
disadvantage in college in their social life, in athletics, and in getting 
opportunities for training in leadership. It is interesting to know, 
however, that only four to seven per cent of those who actually went 
through college early feel that they were at any disadvantage in these 
respects due to age. When those who entered below 17½ are compared
with those who entered above 17½, there are only two to four
per cent more of the former than of the latter stating that they were 
at any disadvantage. 

The results on Question 11 are quite different from those on the 
three preceding questions. Roughly one-third of the younger group 
and one-sixth of the older felt that they were at a disadvantage, due 
to age, in the study and discussion of problems of sociology, philosophy, 
and the like. In the younger group five times as many persons felt 
at a disadvantage here as in the situations given in Questions 8, 9, 
and 10; and twice as many younger entrants as older felt at a dis- 
advantage. This difference is striking. It is so large in relation 
to its standard error that it is practically certain, statistically speaking, 
that it could not be attributed to chance factors.

TABLE V.—THE PERCENTAGE OF THE YOUNGER AND OLDER ENTRANTS REPORTING
THEMSELVES AS HAVING BEEN AT A DISADVANTAGE IN COLLEGE DUE TO AGE

                                       Group entering    Group entering
                                       at 17½ or         later than 17½.
                                       younger.          Per cent
                                       Per cent          reporting a      Differ-    Diff./
                                       reporting a       defect.          ence       SD diff.*
                                       defect.           N = 412
                                       N = 491
Question 8. Did not enjoy social
  life as much as average due to
  age                                      5.29              2.18           3.11        2.5
Question 9. Had less opportunity
  than average to be a leader due
  to age                                   6.72              2.43           4.29        3.2
Question 10. Did not participate
  in athletics due to age                  4.28              2.43           1.85        1.6
Question 11. Lacked experience
  necessary for study and
  discussion of social, economic,
  sociological, and philosophical
  problems due to age                     32.59             16.26          16.33        5.8

* The formula used for the SD of each percentage was $SD_p = \sqrt{pq/N}$. The
formula for the $SD_{diff.}$ was $SD_{diff.} = \sqrt{SD_{p_1}^2 + SD_{p_2}^2}$. See: Holzinger, K. J.:
“Statistical Methods for Students in Education.” Boston: Ginn, 1928, p. 248.
All differences which are as large as 2.78 times the $SD_{diff.}$ are considered statistically
reliable.
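
The last column of Table V can be recomputed from the percentages and group sizes. The sketch below, in Python, assumes the usual form of the formula for the standard deviation of a percentage, with p in per cent and q = 100 - p; the function names are illustrative and are not taken from the cited text.

```python
import math

N_YOUNGER = 491
N_OLDER = 412
RELIABILITY_THRESHOLD = 2.78  # criterion stated in the note to Table V


def sd_of_percentage(p, n):
    """SD of a percentage p (on a 0-100 scale) based on n cases."""
    q = 100.0 - p
    return math.sqrt(p * q / n)


def critical_ratio(p1, n1, p2, n2):
    """Difference between two percentages divided by the SD of the difference."""
    sd_diff = math.sqrt(sd_of_percentage(p1, n1) ** 2 + sd_of_percentage(p2, n2) ** 2)
    return (p1 - p2) / sd_diff


# Question -> (per cent of younger group, per cent of older group), from Table V.
table_v = {
    "Question 8": (5.29, 2.18),
    "Question 9": (6.72, 2.43),
    "Question 10": (4.28, 2.43),
    "Question 11": (32.59, 16.26),
}

ratios = {
    question: critical_ratio(p_young, N_YOUNGER, p_old, N_OLDER)
    for question, (p_young, p_old) in table_v.items()
}
for question, ratio in ratios.items():
    print(question, round(ratio, 2), ratio >= RELIABILITY_THRESHOLD)
```

Under these assumptions the computed ratios agree with the tabled values to within rounding. By the stated criterion of 2.78, the differences on Questions 9 and 11 are reliable and those on Questions 8 and 10 are not.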


There seems to be no doubt that the students entering college at 
17½ or younger are more likely to report they are at a disadvantage
in certain courses due to age than are students entering later; however, 
it is not easy to interpret this result. Does it mean that the younger 


students really are at a disadvantage as shown by inferior work done; 
or that they are more clever in appreciating the weaknesses in their 
background; or that they have a higher standard as to what proper 
preparation really is? The first of these possibilities is the only one 
which directly concerns us here, because if these younger students 
accomplish in their classes as much or more than the average, but at 
the same time consider themselves at a disadvantage, it seems clear
that this statement is merely reflecting a high standard on their part.
Early entrance would be serious if this feeling of being at a disadvantage 
were accompanied by facts to show an actual inferiority in performance. 
No objective measures of achievement were available by which these 
possible explanations could be tested out for the sample upon which 
Table V was based. However, the experimenters were fortunate 
in having an opportunity in the Pennsylvania Survey data to deter- 
mine whether the younger students were actually inferior to the older 
ones in measurable achievement. 

With the cooperation of the Carnegie Foundation¹ it was possible
to send our questionnaire to a random sampling of the young students 
tested in Pennsylvania. The percentage of young students in this 
sampling stating that they were at a disadvantage agreed very 
well with the figure given for the other sampling in Table V. 
First, using this sampling, a comparison was made between the average
achievement score made by the early entrants who said they were 
at a disadvantage and the average score of all students. It was 
found that the former exceeded the latter. Second, a comparison 
was made between the average score of the early entrants who said 
they were at a disadvantage and the early entrants who said they were 
not. Practically no difference was found between the two groups. 
Third, a comparison was made between a random sampling of the 
young seniors (age eighteen to twenty) and a random sampling of the 
older seniors (age twenty-two to twenty-eight) on those particular 
questions of the College Achievement Test which bore upon Social 
Conventions, Religion, Philosophy, Economics, and Politics. There 
were in all four hundred twenty-six questions on these topics. The 
scores on these were specially culled out by the experimenters. The 
average score made by the young group was 84.56, with a SD of 34.02 
(N = 198); while the average score made by the old group was 81.55, 
with a SD of 28.91 (N = 184). 
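
The text does not report a reliability check for this difference between means. As an illustration only, the Python sketch below applies the same critical-ratio reasoning used for Table V, assuming the usual standard error of a mean, SD divided by the square root of N; that choice of formula is an assumption and not a procedure stated by the writers.

```python
import math

# Means, SDs, and Ns reported in the text for the 426 selected questions.
young_mean, young_sd, young_n = 84.56, 34.02, 198
old_mean, old_sd, old_n = 81.55, 28.91, 184


def se_of_mean(sd, n):
    """Standard error of a mean (assumed formula: SD / sqrt(N))."""
    return sd / math.sqrt(n)


difference = young_mean - old_mean
se_difference = math.sqrt(se_of_mean(young_sd, young_n) ** 2 + se_of_mean(old_sd, old_n) ** 2)
ratio = difference / se_difference

print(round(difference, 2), round(se_difference, 2), round(ratio, 2))
```

On these assumptions the difference of about three points is less than one standard error, well short of the 2.78 criterion. This supports the modest reading that the young group is not inferior, rather than a claim that it is reliably superior on these questions.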

From these three lines of evidence, we are led to conclude that the 
young entrants are not, on the average, inferior to the older ones in 
actual achievement in subjects dealing with economic, sociological, and 
philosophical problems. It is interesting to find that one-third of the 
young entrants stated that they felt a lack of experience necessary 





1 The writers are indebted to Mr. W. S. Learned of the Carnegie Foundation 
and to Mr. Robert Mendenhall of the statistical staff of the Pennsylvania Survey 
for aid in obtaining these data. 







for the study and discussion of such questions whereas only one-sixth 
of the older entrants expressed such a feeling of deficiency. What 
factors account for this difference, we do not know, though we believe 
that the difference in standard of judgment as to what an adequate 
background would be is an important one. But, regardless of how 
this may be, it seems certain that this feeling of lack of experience 
on the part of the young entrants was not accompanied by inferiority 
in measurable performance in the designated subjects. 


SUMMARY 


1. From a preliminary study of eighty-four teachers it was found 
that, on the average, the age which a given individual considers suitable 
for entrance to college varies directly with his own age of entrance. 
This was further tested out on a population of nine hundred three 
cases, and this tendency was found to be very pronounced. This is 
significant because we should expect from these findings that, other 
things equal, bright children will have greater difficulty in saving 
time in any elementary and high schools where the policies are dictated 
by school officials who themselves entered college relatively late. 

2. Only 7.3 per cent of the four hundred ninety-one cases who 
entered college at a young age (17½ or younger) would prefer to have
spent any more time than they did in the elementary school. Seven- 
teen per cent would prefer to have spent more time in high school. 
An additional 8.7 per cent includes those who would prefer to have
spent extra time in college-preparatory school or junior college and 
those who would prefer to have spent extra time but did not signify 
where. The remaining 67 per cent did not wish that they had spent 
any more time than they did in preparing for college. 

3. From an analysis of the replies of four hundred ninety-one 
persons who entered college at 17½ or younger and of four hundred
twelve who entered later than 17½, it was found that four to
seven per cent of the younger group felt that due to age they were at a 
disadvantage in social life, in opportunities for leadership, and in 
athletics, whereas from two to four per cent of the older group felt 
that they were at a disadvantage due to age. 

4. One-third of those who entered college at seventeen or earlier 
stated that they felt that they were at a disadvantage due to age in
the study and discussion of social, economic, sociological, and philo- 
sophical problems, whereas only one-sixth of the older group reported 
a consciousness of such a disadvantage. However, a study of achievement
of young students on an extensive test does not show any inferiority
of performance on their part. Indeed the younger students made,
on the average, a higher total score on the College Achievement Test 
than the older students, and they made a higher average score on 
those very parts of the examination which dealt with these specific 
problems. The report from the younger students to the effect that 
they felt at a disadvantage probably represented not an inability to 
profit by the work in sociology, economics, and philosophy as well 
as the average student of normal age, but rather a higher standard 
of the background needed for such courses.


Interest and Ability in Reading, by A. I. Gates. New York: The
Macmillan Co., 1930. Pp. VII + 264. 


The development of a program of research dealing with materials 
and methods of the teaching of reading by Dr. A. I. Gates and his 
associates in the Institute of Educational Research of Teachers College 
has resulted in contributions that may be classified as among the most 
important appearing during the past decade. The present volume 
reports the more recent investigations which show the influence of 
various factors upon both interest and ability in children’s reading. 
As in previous volumes the discussions and conclusions are based on 
experimental findings. 

The four chapters of Part I give the experimental results. Studies 
were conducted to determine the optimum vocabulary burden for 
dull, average and bright children. In general, ‘‘the less apt the pupil, 
the more extensive must be his experience with printed words to insure 
the abundance of fluent reading which is essential to healthy growth 
of interest and ability”’ in reading. Ordinary primary readers intro- 
duce new words too rapidly for even the brightest pupils to master 
without considerable supplementary work of some sort. The number 
of repetitions per word which is provided in the classroom work during 
the first year should vary with intellectual level of the pupil. Gates 
suggests tentative norms ranging from about twenty repetitions on 
the average for pupils with IQ’s of 120-129 to fifty-five for those with 
IQ’s of 60-69.

Two related studies showed that pupils preferred stories approxi- 
mately twice as frequently as informative material. This was true 
for all levels of intelligence. Further experiments revealed definite 
preferences for various types of material, and that these preferences 
changed with age. Analysis revealed that at least three positive 
factors conditioned children’s interest in reading: Good style (form or 
quality), difficulty, and suitability of content.

Each of fourteen literary characteristics was correlated with
interest. The coefficients ranged from +.35 to —.15 and indicate 
that ‘‘the elements of surprise, liveliness or action, animalness, con- 
versation, humor, and plot are the most potent sources of interest in 
children’s literature. Narrativeness, poeticalness, familiarity, and 
repetition are favorable but rather impotent, especially the last two. 
Fancifulness, realism, verse form, and probably repetition, in and of 
themselves are quite without influence either to increase or decrease 
interest. Moralness tends appreciably to decrease interest.”

A final experiment revealed that first grade children take a strong 
interest in reading material which leads to some practical activity. 

In most of the experiments investigating interest in children’s read- 
ing, the materials were read to the children by the experimenter. The 
results were then interpreted to apply to reading by the pupils them- 
selves. No evidence was adduced by the author to show that prefer- 
ences for material read aloud by another are the same as preferences 
for material read by the pupil himself. It is possible that a distinc- 
tion should be made between these two procedures until they have 
been shown to yield equivalent results. 

The author’s statements that the pupil is favorably disposed toward 
poetic qualities, and that familiarity has a positive but slight influence 
on interest in the first three grades (p. 83) are apt to be misleading 
to the uncritical reader. Examination of Table XV shows that the 
correlations between interest and these two qualities are only +.07 
and +.06 respectively. 

Part II is devoted to an explanation and illustration of reading 
instruction based on the findings in Part I. Dr. Gates gives an outline
of what promises to become one of our most effective methods of teaching
reading. The pupil is introduced to a related series of projects
in which his inclinations to engage in varied linguistic, dramatic, artistic,
constructive, and exploratory activities are given play. The develop- 
ment of reading abilities results from doing the things which the 
series of situations calls for. 

All serious students of reading methods as well as elementary school 
teachers will appreciate the book. It has a pleasing style and contains 
sound procedures. MILES A. TINKER.

University of Minnesota. 




Educational Measurement in the Elementary Grades, by I. N. Madsen.
Yonkers-on-Hudson: World Book Co., 1930. Pp. X + 294. 


The adjectives which best describe this volume are sound and
“workmanlike.” It is a tidy, well-arranged volume, and any
information that is expected seems to occur in its appropriate place. Designed
as an introductory text in Measurement for normal school students, 
it seems at times to rise beyond those modest limits. This, however, 
is a fault of the right kind. It is astonishing to find how quickly what 
was abstruse to one generation becomes the commonplace of the next. 
W. H. Kilpatrick and the reviewer once spent a whole evening in 
1908 learning about a mysterious thing called a median and how to 
calculate it. I think in self-defence that I ought to say we learned it 
thoroughly; in Morrisonian language, it became an adaptation. 
Knowledge of relativity in another generation will be the common 
property of high school students. So a text that is slightly above 
rather than slightly below the level of the students for whom it is 
designed is to be commended. 

There are two chapters on Individual Differences; two on Statis- 
tical Methods; two on Intelligence Tests; two on Achievement Tests 
and one each on Meaning of Scores; Educational Uses of Standardized 
Tests; and Improvement of Teachers’ Examinations. The bibli- 
ographies are full and up-to-date. The reviewer enjoyed least the 
chapter on “Improvement of Teachers’ Examinations”; it did not
seem to rise to the high level of the others. A good book of which 
the author may be justly proud. PETER SANDIFORD. 

University of Toronto. 





Enriching the Curriculum for Gifted Children, by W. J. Osburn and 
Ben J. Rohan. New York: The Macmillan Co., 1931. Pp. 
XIV + 408. 


In the more progressive sections of the United States, schoolmen 
have taken one step beyond the question, “Should we make
educational adjustments for the gifted child?” They are now asking
themselves, “What shall those adjustments be?” Osburn and Rohan
give their answer by offering a detailed plan for enrichment without 
segregation. In theory, the reviewer strongly favors segregation, 
but it is obvious that enrichment for the bright while they remain 
in heterogeneous classes contains far less dynamite. 

The kind of enrichment recommended by Osburn and Rohan is a
group of carefully planned and well-directed extra-curricular activities. 
To be more specific, the gifted child is expected to find an outlet for 
his intellectual curiosity in clubs organized as a result of the expressed 
desire of a number of children to learn more about a certain field. 
For example, there might be a radio club; the members would consider 
topics growing out of their common interest. The development of 
the radio, the lives of inventors, and the radio as a civilizing agency 
are but a few. 

The book is most worth while. It comes at a time when a number 
of teachers, awake to the needs of gifted children, are looking for
specific help. To them the materials presented by Osburn and Rohan 
are heartily recommended. Their method is a compromise, but 
undoubtedly a necessary one at this point in the growth of a realization 
on the part of educators that gifted children can not take care of 
themselves. HERBERT A. CARROLL. 

University of Minnesota. 





The Art of Study, by T. H. Pear. New York: E. P. Dutton & Co.,
1931. Pp. X + 117. 


This volume, which purports to tell how to study, may be compared 
in a sense to bread made without flour. The author apparently forgot 
to put in the methods which would furnish the master key to success 
and efficiency in intellectual work. 

He speaks of study as an art, and thereby more or less implies 
that it is a field for the genius rather than for an ordinary person who 
would learn it as a trade. If we may judge by the author’s treatment 
of the subject, success in study will come to the person who has a 
natural gift and who, by chance, gets the necessary inspiration, rather 
than to the one who diligently watches his movements and studies 
the steps which he takes in learning, with the aim of eliminating the 
unsuccessful ones and improving the successful ones. 

In twelve chapters of scintillating and brilliant essay, a master of 
the English language presents interesting ideas more or less loosely 
related to the general field of learning, study, and getting on in the 
world. The treatment is interesting and full of concrete illustrations. 
It is occasionally difficult to tell what the illustrations are intended to 
illustrate, and at times the reader is inclined to believe they were 
included mainly because they were interesting. Perhaps this is due 
to the fact that the volume is an outgrowth of radio lectures given by 
the author, and the consequent necessity of “jazzing up” the
presentation to keep the listeners from tuning out. In this regard he has
succeeded well, for the volume is delightful reading.

The author is British, and is a Professor of Psychology in the 
University of Manchester. Perhaps these two facts explain in part 
the lack of practical application that is found in the volume. Appar- 
ently, there has not been as high a degree of specialization in research 
and writing in the field of teacher-training in England as in America. 
General psychology courses are more likely to be offered instead of 
courses in methods of teaching. More attention is given to the science 
of psychology than to the technique and practice of teaching and learn- 
ing. Furthermore, the interest of the psychologist, whether in Eng- 
land or America, is centered usually more upon the theory or principles 
of learning than upon the concrete practice and technique of it. 

As an interesting, if somewhat aimless, collection of essays, the 
volume will be delightful reading. As a basis for definite instruction 
in how to study, or as a text in a how-to-study course, either in high 
school or college, the volume has little to offer. C. C. CRAWFORD.

University of Southern California. 





Adolescent Education, by Frederick Elmer Bolton. New York: The 
Macmillan Co., 1931. Pp. XV + 506. 


This book gives almost the impression of a series of chapel talks 
by a wise, benevolent and friendly old gentleman. It is not quite 
that, of course, for there are ten tables on the distribution of the IQ, 
nine on the statistics of physical growth, and seven of criminal statis- 
tics. A considerable portion of the book is given over to quotation 
and G. Stanley Hall retains the lion’s share of attention both in direct 
quotation and in point of view. On the first page we read: “We
usually think of the schoolboy being pushed or pulled to school, but 
in reality has it not been the schoolboy—and the schoolgirl—who 
have demanded the school as a means of self-revelation, a means of 
penetrating the great mysteries of life, a means of satisfying innate 
strivings and ambitions surging within them?” In discussing the 
pre-adolescent period, hoary and unsubstantiated generalizations are 
made about the especial keenness of the senses during childhood, the 
dominance of mechanical memory, the absence of social feeling, and the
suggestion that now as at no other time in life can a foreign language
be easily learned. No reference is made to the careful studies of 
imitation but such statements as “The child imitates automatically,
spontaneously, and blindly whatever happens to be dominant in his
environment” can be found. One is prepared for the final chapter
on “Character Education” with its insistence that no honest-minded
individual could have any doubt about the real meaning of morality 
and its confidence that “The law of ideo motor action (viz., that
every idea which gains lodgment in the mind tends to express itself 
in action) is sufficient warrant to urge memorizing gems of poetry, 
proverbs, and beautiful, uplifting sayings.” GOODWIN WATSON.
Teachers College, Columbia University. 





Learning a New Language, by C. C. Crawford and E. M. Leitzell. 
Published by C. C. Crawford, University of Southern California, 
Los Angeles, California, 1930. Pp. XIII + 242. 


The book is written avowedly from the standpoint of the learner 
and aims to provide the student with the proper technique of study. 
It is, of course, quite impossible to write such a book without involving 
problems of teaching for, in the last analysis, the choice of the method 
and of the material to be learned depends on the teacher; but it is our 
opinion that a great deal more emphasis is laid on the teacher’s 
problems and method of teaching than the title of the book would 
warrant.

The book deals one by one with the study of the different aspects 
of the language, such as Pronunciation, Vocabulary, Spelling,
Understanding, Speaking, Reading, Writing, Translation and Grammar.
In addition there is a chapter dealing with special problems in the 
study of Latin and an excellent chapter on Language Clubs and 
Games. 

There is nothing startlingly new in the book; much of it has already
been said. The book has, however, the advantage of presenting the 
material in a compact form. It is clear, very readable, with an 
abundance of practical devices and suggestions coming directly from 
class room experiences which should prove very useful. 

To the reviewer, however, the book lacks a firm psychological
background. A book on the technique of study cannot afford to 
ignore the fundamental laws of learning and their special application 
to the learning of a language. No mention is made of the wealth of 
experimental research which has been done in the field of learning; 
there are no references to the psychologists. Expressions such as,
“It has been found” . . . “many authorities agree” . . . will hardly
be considered authoritative references. The consequence is that 
many assertions are made which appear somewhat dubious in the 
light of experimental evidence. 

Although the statement found in the preface, that the book “has
made a definite contribution to the solution of the problems of foreign 
study and teaching,” might justly be questioned, it is true that it may 
prove a useful addition to the literature on the subject and may be 
consulted with profit by both students and teachers. 


LOUISE C. SEIBERT.
Goucher College. 





Educational Achievement in Relation to Intelligence, by C. W. St. John. 
Cambridge: Harvard University Press, 1930. Pages XIV + 219. 


This is the fifteenth of the Harvard Studies in Education. Despite 
the fact that Part I, consisting of sixty-one pages, is devoted to a 
theoretical discussion of education, to intelligence and achievement 
tests, and to a summary of previous researches in the field; and that 
Part IV, dealing with educational organization and method, runs to 
another sixteen pages, the volume is a distinctly important one. 

The subjects were five hundred three boys and four hundred 
fifty-five girls in the public schools of a residential suburb of 
Boston. They were mostly of North European stock (79.3 per 
cent); the rest were of Italian (12.9 per cent), South European, Negro, 
Jewish, and of Asiatic and other ancestry. Records of each child 
extending roughly over four years from the first to the fifth grade 
were obtained. These records, obtained from Dearborn’s Harvard 
Growth Study and from cumulative record cards of the school system, 
included two or more Stanford Binet examinations; several group 
tests of intelligence—Dearborn, Otis Primary and others; Haggerty’s 
Reading Test Sigma 1; Ayres’ Reading Scale; Peet-Dearborn Progress
Tests in Arithmetic, Grades I-VI; and teachers’ marks, in the higher 
grades at least, for reading, language, spelling, writing, arithmetic, 
history, geography, drawing, music, conduct and effort. 

The statistical treatment of the data, with the possible exception
of the use of coarse groupings, seems to be adequate. The evil spirit— 
variability of the teachers’ marks—was exorcised by use of percentile 
rankings and of sigma deviations. All the criteria of intelligence and 
achievement were expressed in terms of sigma values. Instead of
factoring out age by means of the partial correlation technique, the 
author got rid of the difficulty by including only data of pupils who 
were at normal age in the grade. This appears to be just as dubious 
a technique as partialling out the age factor since it resulted in a 
special selection of the subjects. 
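
For readers unfamiliar with the two devices mentioned above, the following is a minimal sketch of what "sigma values" and "partialling out" a third factor usually mean. It is not taken from the book: the function names and the five pupils' figures are hypothetical, and the standard first-order partial correlation formula is assumed to be the technique the reviewer has in mind.

```python
from math import sqrt
from statistics import mean, pstdev


def sigma_scores(values):
    """Express each value as a deviation from the group mean in sigma units."""
    m = mean(values)
    s = pstdev(values)  # population standard deviation of the group
    return [(v - m) / s for v in values]


def pearson_r(xs, ys):
    """Product-moment correlation, computed as the mean product of sigma scores."""
    zx, zy = sigma_scores(xs), sigma_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical figures for five pupils (NOT data from the study).
iq = [95, 100, 105, 110, 120]
marks = [70, 72, 80, 78, 90]
age = [10.5, 10.0, 10.2, 9.8, 9.9]

r_im = pearson_r(iq, marks)
r_ia = pearson_r(iq, age)
r_ma = pearson_r(marks, age)

# Correlation of IQ with marks after the influence of age is removed.
print(round(partial_r(r_im, r_ia, r_ma), 3))
```

Restricting the sample to pupils at normal age for the grade, as the author did, avoids this computation altogether but changes who is in the sample, which is the selection effect noted above.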

The main findings were “that a positive correlation exists among 
all the criteria of intelligence and of educational achievement. The 
correlations between IQ’s and the tests and marks in the subjects of 
study are ‘marked,’ those between IQ’s and marks in effort are ‘low,’ 
and those between IQ’s and marks in conduct are very low if not 
negligible. The intercorrelations of the various criteria of achieve- 
ment in the subjects of study are either ‘marked’ or ‘high.’ The 
correlations between marks in conduct and the criteria of achievement 
in the subjects of study are ‘low,’ but those between marks in effort 
and the same criteria are ‘marked,’ as are all other intercorrelations 
of teachers’ marks. The marks seem to give evidence of the ‘halo 
effect,’ probably ‘radiating’ chiefly from characteristics of personality 
and behavior of the pupils.” 

The author’s interpretation of his findings is that “the special
abilities which children acquire in school tend to be of the same order 
of merit as their total of abilities (intelligence), but that very many 
pupils, being exceptions to this rule, develop distinctly more, or less, 
of the school abilities than of other abilities.” All of which we
suspected and we are glad to have our suspicions confirmed.

The book is written in a clear and simple style although we do not 
like the author’s use of “none are” (pages 123 and 150). A valuable
study, especially for the research workers in the field of measurement. 

PETER SANDIFORD. 


University of Toronto. 





